I have the HTML source shown below loaded in a UIWebView.
From it, I want to extract only the user-facing text:
text1
text2 text2
text3 text3 text3
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>1322170516271</title>
<meta name="viewport" content="initial-scale=1.0, user-scalable=1, minimum-scale=1.0, maximum-scale=4.0">
<style type="text/css">
body
{
padding: 5px;
margin: 0px;
font-family: Helvetica, Arial;
font-size: 12pt;
background-color: #efefef;
background-image: url(ArticleBackground.jpg);
background-position: cover;
color: #000000;
}
h1
{
text-align: center;
border-bottom: 1px dotted #805050;
font-size: 28px;
line-height: 38px;
margin-bottom: 30px;
text-shadow: 0 2px 1px white;
color: #803030;
}
</style>
</head>
<body>
<script type="text/javascript">
function printMe()
{
print();
}
</script>
<div style='align:center; padding: 20px;'>
<div>
<b>text1</b><br><br>
<h2>
text2 text2
</h2>
<br>
text3 text3 text3
</div>
</div>
</body>
</html>
However, when I use
[webView stringByEvaluatingJavaScriptFromString:@"document.documentElement.textContent"]
I get the output below, which also includes the page title, the CSS rules for body and h1, and the printMe script. I don't need any of that. I only want the text that is actually user-facing.
1322170516271
body
{
padding: 5px;
margin: 0px;
font-family: Helvetica, Arial;
font-size: 12pt;
background-color: #efefef;
background-image: url(ArticleBackground.jpg);
background-position: cover;
color: #000000;
}
h1
{
text-align: center;
border-bottom: 1px dotted #805050;
font-size: 28px;
line-height: 38px;
margin-bottom: 30px;
text-shadow: 0 2px 1px white;
color: #803030;
}
function printMe()
{
print();
}
text1
text2 text2
text3 text3 text3
Thanks for any insight.
UPDATE
[webView stringByEvaluatingJavaScriptFromString:@"document.body.innerHTML"] doesn't meet my goal either, because it returns the markup, including the script block:
<script type="text/javascript">
function printMe()
{
print();
}
</script>
<div style="align:center; padding: 20px;">
<div>
<b>text1</b><br><br>
<h2>
text2 text2
</h2>
<br>
text3 text3 text3
</div>
</div>
UPDATE: This is needed for an existing project. If I had the chance to redesign it, a solution would be easy to find, but the HTML source has to stay as it is, which makes this more difficult.
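To make the goal concrete, here is a minimal sketch of the kind of extraction I am after, assuming the HTML stays unchanged: walk the DOM starting at the body and concatenate the text nodes, skipping anything inside script or style elements. The function name visibleText is hypothetical (it is not part of UIWebView or WebKit), and I have not confirmed this is the best approach.

```javascript
// Hypothetical helper: collects the text of all text nodes under `node`,
// skipping the contents of <script> and <style> elements.
function visibleText(node) {
  var ELEMENT_NODE = 1;
  var TEXT_NODE = 3;

  if (node.nodeType === TEXT_NODE) {
    return node.nodeValue;
  }
  if (node.nodeType !== ELEMENT_NODE) {
    return ''; // comments and other node types carry no user-facing text
  }

  // nodeName case can differ between HTML and XHTML parsing, so normalize it.
  var name = node.nodeName.toUpperCase();
  if (name === 'SCRIPT' || name === 'STYLE') {
    return '';
  }

  var out = '';
  for (var i = 0; i < node.childNodes.length; i++) {
    out += visibleText(node.childNodes[i]);
  }
  return out;
}
```

To run it from the app, the function source would presumably be included in the string passed to stringByEvaluatingJavaScriptFromString:, followed by a call such as visibleText(document.body). The result keeps whatever whitespace the source has between elements, so it may still need trimming. document.body.innerText may be a simpler alternative, since in WebKit it reflects the rendered text, but I have not verified it against this page.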