LeetCode HTML Entity Parser

1410. HTML Entity Parser

An HTML entity parser is a parser that takes HTML code as input and replaces all the entities of the special characters with the characters themselves.

The special characters and their entities for HTML are:

  • Quotation Mark: the entity is &quot; and symbol character is ".
  • Single Quote Mark: the entity is &apos; and symbol character is '.
  • Ampersand: the entity is &amp; and symbol character is &.
  • Greater Than Sign: the entity is &gt; and symbol character is >.
  • Less Than Sign: the entity is &lt; and symbol character is <.
  • Slash: the entity is &frasl; and symbol character is /.

Given the input text string to the HTML parser, you have to implement the entity parser.

Return the text after replacing the entities by the special characters.

Example 1:

Input: text = "&amp; is an HTML entity but &ambassador; is not."
Output: "& is an HTML entity but &ambassador; is not."
Explanation: The parser will replace the &amp; entity by &

Example 2:

Input: text = "and I quote: &quot;...&quot;"
Output: "and I quote: \"...\""

Example 3:

Input: text = "Stay home! Practice on Leetcode :)"
Output: "Stay home! Practice on Leetcode :)"

Example 4:

Input: text = "x &gt; y &amp;&amp; x &lt; y is always false"
Output: "x > y && x < y is always false"

Example 5:

Input: text = "leetcode.com&frasl;problemset&frasl;all"
Output: "leetcode.com/problemset/all"

Constraints:

  • 1 <= text.length <= 10^5
  • The string may contain any possible characters out of all the 256 ASCII characters.

Given an HTML string, convert the special-character entities in it back to the characters they represent.

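This is an easy problem: traverse the string, find each substring that starts with & and ends with ;, and convert it back to the original character where applicable. Note that such a substring may not be any of the six entities listed above; in that case no conversion is needed and it is copied verbatim. The code is as follows: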

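#include <string>
using namespace std;

class Solution {
public:
	string entityParser(string text) {
		int n = text.size(), i = 0;
		string ans;
		while (i < n) {
			// Copy ordinary characters up to the next '&'.
			while (i < n && text[i] != '&') {
				ans.push_back(text[i]);
				++i;
			}
			if (i >= n) break;
			// Look for the ';' that closes the candidate entity. Stop early at
			// another '&', since the first '&' then cannot start an entity
			// (e.g. "x & y &gt; z" must still decode the "&gt;").
			int j = i + 1;
			while (j < n && text[j] != ';' && text[j] != '&') ++j;
			if (j >= n || text[j] == '&') {
				// No closing ';': copy text[i..j) verbatim and rescan from j.
				ans.append(text, i, j - i);
				i = j;
				continue;
			}
			string mark = text.substr(i, j - i + 1);
			if (mark == "&quot;") ans.push_back('\"');
			else if (mark == "&apos;") ans.push_back('\'');
			else if (mark == "&amp;") ans.push_back('&');
			else if (mark == "&gt;") ans.push_back('>');
			else if (mark == "&lt;") ans.push_back('<');
			else if (mark == "&frasl;") ans.push_back('/');
			else ans += mark; // Not one of the six entities: copy as is.
			i = j + 1;
		}
		return ans;
	}
};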

This solution was accepted (AC), with a runtime of 460 ms.
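
As a possible variation on the scan described above, the six entities can be kept in a small lookup table and matched directly at each &, so there is no need to search ahead for a ;. This is only a minimal sketch and not the submitted solution; the function name decodeEntities and the table kEntities are hypothetical names introduced here for illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: replaces each known entity with its character.
std::string decodeEntities(const std::string& text) {
    // Table of the six entities from the problem statement.
    static const std::vector<std::pair<std::string, char>> kEntities = {
        {"&quot;", '"'}, {"&apos;", '\''}, {"&amp;", '&'},
        {"&gt;", '>'},   {"&lt;", '<'},    {"&frasl;", '/'},
    };
    std::string out;
    out.reserve(text.size());
    std::size_t i = 0;
    while (i < text.size()) {
        bool matched = false;
        if (text[i] == '&') {
            for (const auto& entry : kEntities) {
                const std::string& entity = entry.first;
                // compare() clamps the length to what remains of text, so a
                // truncated entity at the end of the input never matches.
                if (text.compare(i, entity.size(), entity) == 0) {
                    out.push_back(entry.second);
                    i += entity.size();
                    matched = true;
                    break;
                }
            }
        }
        // Anything that is not a known entity is copied one character at a time.
        if (!matched) out.push_back(text[i++]);
    }
    return out;
}
```

Because each & is compared against at most six short strings, the work per character is bounded by a constant, and the output of a replacement is never scanned again (so "&amp;gt;" becomes "&gt;", not ">").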
