5
votes

How can I encode a string in Swift so that every special character is replaced with its matching HTML number?

Let's say I have the following string:

var mystring = "This is my String & That's it."

and then replace each special character with its HTML number:

& = &#38;
' = &#39;
> = &#62;

But I want to do this for all special characters, not just the ones in the string above. How can this be done?


5 Answers

6
votes
extension String {
    func makeHTMLfriendly() -> String {
        var finalString = ""
        // Encode every Unicode scalar as a decimal numeric character reference.
        for scalar in unicodeScalars {
            // The trailing semicolon terminates the reference.
            finalString.append("&#\(scalar.value);")
        }
        return finalString
    }
}

Usage:

let newString = oldString.makeHTMLfriendly()

This appears to work in general, since an HTML numeric character reference is a decimal Unicode code point, which is what scalar.value returns.

Note that it converts everything, even alphanumeric characters that don't need converting. It shouldn't be too difficult to edit it so that some characters are left alone.
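
As a minimal sketch of that edit, assuming you only want to keep ASCII letters, digits, and spaces unencoded: the name makeHTMLfriendlySelective is hypothetical, and the set of characters to keep is an assumption you may want to change.

```swift
extension String {
    /// Hypothetical variant of makeHTMLfriendly() that keeps ASCII letters,
    /// digits, and spaces as-is and encodes every other Unicode scalar.
    func makeHTMLfriendlySelective() -> String {
        var finalString = ""
        for scalar in unicodeScalars {
            switch scalar {
            case "a"..."z", "A"..."Z", "0"..."9", " ":
                // Assumed "safe" set: copy the scalar through unchanged.
                finalString.unicodeScalars.append(scalar)
            default:
                // Everything else becomes a decimal numeric character reference.
                finalString.append("&#\(scalar.value);")
            }
        }
        return finalString
    }
}

let selective = "This is my String & That's it.".makeHTMLfriendlySelective()
print(selective) // This is my String &#38; That&#39;s it&#46;
```

Widening or narrowing the kept set only means changing the first case of the switch.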

2
votes

Try SwiftSoup

func testEscape() throws {
    let text = "Hello &<> Å å π 新 there ¾ © »"

    let escapedAscii = Entities.escape(text, OutputSettings().encoder(String.Encoding.ascii).escapeMode(Entities.EscapeMode.base))
    let escapedAsciiFull = Entities.escape(text, OutputSettings().charset(String.Encoding.ascii).escapeMode(Entities.EscapeMode.extended))
    let escapedAsciiXhtml = Entities.escape(text, OutputSettings().charset(String.Encoding.ascii).escapeMode(Entities.EscapeMode.xhtml))
    let escapedUtfFull = Entities.escape(text, OutputSettings().charset(String.Encoding.utf8).escapeMode(Entities.EscapeMode.extended))
    let escapedUtfMin = Entities.escape(text, OutputSettings().charset(String.Encoding.utf8).escapeMode(Entities.EscapeMode.xhtml))

    XCTAssertEqual("Hello &amp;&lt;&gt; &Aring; &aring; &#x3c0; &#x65b0; there &frac34; &copy; &raquo;", escapedAscii)
    XCTAssertEqual("Hello &amp;&lt;&gt; &angst; &aring; &pi; &#x65b0; there &frac34; &copy; &raquo;", escapedAsciiFull)
    XCTAssertEqual("Hello &amp;&lt;&gt; &#xc5; &#xe5; &#x3c0; &#x65b0; there &#xbe; &#xa9; &#xbb;", escapedAsciiXhtml)
    XCTAssertEqual("Hello &amp;&lt;&gt; Å å π 新 there ¾ © »", escapedUtfFull)
    XCTAssertEqual("Hello &amp;&lt;&gt; Å å π 新 there ¾ © »", escapedUtfMin)
    // odd that it's defined as aring in base but angst in full

    // round trip
    XCTAssertEqual(text, try Entities.unescape(escapedAscii))
    XCTAssertEqual(text, try Entities.unescape(escapedAsciiFull))
    XCTAssertEqual(text, try Entities.unescape(escapedAsciiXhtml))
    XCTAssertEqual(text, try Entities.unescape(escapedUtfFull))
    XCTAssertEqual(text, try Entities.unescape(escapedUtfMin))
}
0
votes

For a little more variety:

extension String {
    var htmlCompatibleDecimalEncoded: String {
        self.unicodeScalars.reduce(into: "") { partialResult, scalar in
            partialResult.append(
                scalar.properties.isPatternSyntax ? "&#\(scalar.value);" : .init(scalar)
            )
            // For percent encoded hex-replacements we could use this:
            // "%\(String(scalar.value, radix: 16, uppercase: true))"
            // But there is a more convenient method for that.
        }
    }
    
    func htmlCompatiblePercentEncoded(allowing allowedCharacters: CharacterSet = []) -> String {
        self.addingPercentEncoding(withAllowedCharacters: allowedCharacters) ?? self
    }
}

let my: String = #"This is my "String" & That's it / <or is it?>"#

my.htmlCompatibleDecimalEncoded                                 // "This is my &#34;String&#34; &#38; That&#39;s it &#47; &#60;or is it&#63;&#62;"
my.htmlCompatiblePercentEncoded(allowing: .urlQueryAllowed)     // "This%20is%20my%20%22String%22%20&%20That's%20it%20/%20%3Cor%20is%20it?%3E"
my.htmlCompatiblePercentEncoded()                               // "%54%68%69%73%20%69%73%20%6D%79%20%22%53%74%72%69%6E%67%22%20%26%20%54%68%61%74%27%73%20%69%74%20%2F%20%3C%6F%72%20%69%73%20%69%74%3F%3E"
let allowed: CharacterSet = .whitespaces.union(.alphanumerics)
my.htmlCompatiblePercentEncoded(allowing: allowed)              // "This is my %22String%22 %26 That%27s it %2F %3Cor is it%3F%3E"

-1
votes

Check the list of all special characters in HTML:

http://www.ascii.cl/htmlcodes.htm

Then create a utility for converting the characters, like this:

import Foundation

class Util: NSObject {

    func parseSpecialStrToHtmlStr(oriStr: String) -> String {
        var returnStr: String = oriStr

        // Replace "&" first so the ampersands in the inserted codes are not re-encoded.
        returnStr = returnStr.replacingOccurrences(of: "&", with: "&#38;")
        returnStr = returnStr.replacingOccurrences(of: "'", with: "&#39;")
        returnStr = returnStr.replacingOccurrences(of: ">", with: "&#62;")
        ...

        return returnStr
    }
}

Do it yourself: write your own helper function.
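
One possible shape for such a helper is a single pass over the string with a lookup table, sketched below. The names htmlNumericCodes and replaceSpecialCharacters are hypothetical, and the table holds only a few illustrative entries rather than a complete list.

```swift
/// Hypothetical lookup table: illustrative entries only, not a complete list.
let htmlNumericCodes: [Character: String] = [
    "&": "&#38;",
    "'": "&#39;",
    "<": "&#60;",
    ">": "&#62;",
    "\"": "&#34;",
]

/// Hypothetical helper that walks the string once and swaps in the codes.
func replaceSpecialCharacters(in text: String) -> String {
    var result = ""
    for character in text {
        // Each input character is looked up exactly once;
        // characters missing from the table pass through unchanged.
        result += htmlNumericCodes[character] ?? String(character)
    }
    return result
}

print(replaceSpecialCharacters(in: "This is my String & That's it."))
// This is my String &#38; That&#39;s it.
```

Because each input character is visited once, the ampersands inside already-inserted codes are never encoded again, so the order of the table entries does not matter.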


Edit

If you think that's too much work, check this out: https://github.com/adela-chang/StringExtensionHTML

-1
votes

This is the easiest way:

let encodedValue = yourValue.addingPercentEncoding(withAllowedCharacters: .urlHostAllowed)
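
To see what that call produces, here is a small sketch using the string from the question. It suggests that percent-encoding yields URL-style %XX escapes rather than HTML numbers of the form &#NN;, so it may not be what the question asks for. The variable names are placeholders.

```swift
import Foundation

let yourValue = "This is my String & That's it."

// Percent-encoding targets URLs: characters outside the allowed set become %XX.
// The call returns an optional, so fall back to the input if it fails.
let percentEncoded = yourValue.addingPercentEncoding(withAllowedCharacters: .urlHostAllowed) ?? yourValue

// An HTML numeric reference, by contrast, is "&#" + decimal code point + ";".
let ampersandReference = "&#\(("&" as Unicode.Scalar).value);"

print(percentEncoded)      // spaces appear as %20
print(ampersandReference)  // &#38;
```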