I would like to extract "http://www.somewebsite.com/wanted.jpg" from the string below, where alt is set to "thumbnail", and avoid grabbing http://www.somewebsite.com/notwanted.jpg:
<span>Some information here
<div>
<img src="http://www.somewebsite.com/notwanted.jpg" width="15" height="15" alt="emoticon">
<img src="http://www.somewebsite.com/wanted.jpg" alt="thumbnail">
</div>
</span>
What is the easiest way to do that?
With all the usual warnings about parsing HTML with regex in mind, this C# regex will match the URL you want:
(?<=src=")[^"]+(?="[^">]*?alt="thumbnail")
To test it in C#:
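using System;
using System.Text.RegularExpressions;

// s1 is assumed to hold the HTML snippet from the question
var myRegex = new Regex("(?<=src=\")[^\"]+(?=\"[^\">]*?alt=\"thumbnail\")");
string resultString = myRegex.Match(s1).Value;
Console.WriteLine(resultString);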
Output:
http://www.somewebsite.com/wanted.jpg
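For a version that runs as-is, here is a minimal sketch with s1 defined as a verbatim string. It assumes C# 9 or later (top-level statements), and the helper name ExtractThumbnailSrc is hypothetical, introduced only for this illustration:

```csharp
using System;
using System.Text.RegularExpressions;

// The HTML from the question, stored in a verbatim string ("" escapes a quote).
string s1 = @"<span>Some information here
<div>
<img src=""http://www.somewebsite.com/notwanted.jpg"" width=""15"" height=""15"" alt=""emoticon"">
<img src=""http://www.somewebsite.com/wanted.jpg"" alt=""thumbnail"">
</div>
</span>";

Console.WriteLine(ExtractThumbnailSrc(s1));

// Hypothetical helper (not part of the original answer): returns the first
// src value that is followed, inside the same tag, by alt="thumbnail",
// or an empty string when nothing matches.
static string ExtractThumbnailSrc(string html)
{
    var myRegex = new Regex("(?<=src=\")[^\"]+(?=\"[^\">]*?alt=\"thumbnail\")");
    return myRegex.Match(html).Value;
}
```

Wrapping the pattern in a helper is only a convenience here; Match(...).Value is an empty string when the match fails, so the caller can check for that instead of handling an exception.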
Explanation
(?<=src=") asserts that what precedes is src="
[^"]+ matches all chars that are not a " (that's what we want)
(?="[^">]*?alt="thumbnail") asserts that what follows is a quote, then any chars that are not a quote or a >, followed by alt="thumbnail"
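Because the lookahead only inspects what comes after the src value, and refuses to cross a quote or a >, the pattern depends on alt="thumbnail" being the very next attribute after src. The sketch below illustrates this with three made-up tags; the example.com URLs and the variable names are assumptions for illustration and do not come from the question:

```csharp
using System;
using System.Text.RegularExpressions;

var myRegex = new Regex("(?<=src=\")[^\"]+(?=\"[^\">]*?alt=\"thumbnail\")");

// alt directly follows src, so the lookahead succeeds.
string srcFirst = "<img src=\"http://example.com/a.jpg\" alt=\"thumbnail\">";
// alt comes before src, so nothing after the URL satisfies the lookahead.
string altFirst = "<img alt=\"thumbnail\" src=\"http://example.com/a.jpg\">";
// A quoted attribute sits between src and alt; [^">]*? cannot cross its quote.
string quotedBetween = "<img src=\"http://example.com/a.jpg\" width=\"15\" alt=\"thumbnail\">";

string srcFirstResult = myRegex.Match(srcFirst).Value;
string altFirstResult = myRegex.Match(altFirst).Value;
string quotedBetweenResult = myRegex.Match(quotedBetween).Value;

Console.WriteLine(srcFirstResult);             // the URL
Console.WriteLine(altFirstResult.Length);      // 0: a failed match has an empty Value
Console.WriteLine(quotedBetweenResult.Length); // 0: same reason
```

If the attribute order or the attributes in between vary in the real markup, this is where the warning about parsing HTML with regex applies, and an HTML parser is likely the safer route.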