DJ메탈짱™의 Free Style

[C#] Removing HTML Tags

뽀&쏭 2015. 12. 8. 12:22


// Requires: using System.Text.RegularExpressions;
private string StripHTML(string htmlString)
{
    // This pattern matches each HTML tag, from '<' through the next '>'.
    // (.|\n) -> any character, or a newline
    // *?     -> zero or more occurrences, non-greedy, meaning the match
    //           stops at the first available '>' it sees, not the last one
    //           (a greedy match would run from the first '<' to the last '>'
    //           and also delete all the text between the tags).
    // Thanks to Oisin and Hugh Brown for helping on this one...
    string pattern = @"<(.|\n)*?>";

    return Regex.Replace(htmlString, pattern, string.Empty);
}
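
To try the method out, here is a minimal, self-contained sketch built around the same pattern. The class name `StripHtmlDemo`, the field `TagPattern`, and the sample strings are made up for illustration and are not part of the original snippet. It also runs the greedy variant once for comparison. Keep in mind that a regular expression is not a real HTML parser, so input such as a '>' character inside an attribute value would be cut in the wrong place.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StripHtmlDemo
{
    // Same pattern as the original snippet: '<', then as few characters
    // (including newlines) as possible, then '>'.
    private static readonly Regex TagPattern = new Regex(@"<(.|\n)*?>");

    public static string StripHtml(string htmlString)
    {
        return TagPattern.Replace(htmlString, string.Empty);
    }

    public static void Main()
    {
        string html = "<p>Hello, <b>world</b>!</p>";
        Console.WriteLine(StripHtml(html)); // Hello, world!

        // A tag broken across two lines is still removed,
        // because the pattern explicitly allows '\n'.
        string multiLine = "<a\nhref=\"#\">link</a>";
        Console.WriteLine(StripHtml(multiLine)); // link

        // Greedy variant (no '?'): matches from the first '<' to the last '>',
        // so the text between the tags is deleted as well.
        string greedy = Regex.Replace(html, @"<(.|\n)*>", string.Empty);
        Console.WriteLine(greedy.Length); // 0
    }
}
```

Storing the `Regex` in a static field avoids rebuilding it on every call. The static `Regex.Replace` used in the original works just as well for occasional use.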