I want to extract data from a table on a web page. My HTML table code looks like this:
<table class=table_x_write cellspacing=1
bordercolordark=#d5d5d5 cellpadding=0 width="100%"
align=center bgcolor=#FFFFFF bordercolorlight=#a5a5a5
border=1>
<tr>
<td class="title_1" align="right" width="20%">
<div align=right>case number:</div></td>
<td class="item_1" width="30%"><input type="text" name="sgdabh" tabindex="1" value="3710140068" readonly="readonly" style="width:98%" class="bg0" id="sgdabhD"></td>
<TD class="title_1" align="right">participant number:</td>
<td class="item_1" width="30%"><span
onmouseover="setEnabled('cyfbh',1)"
onmouseout="setEnabled('cyfbh',2)"> <select name="cyfbh" style="width:100%"><option value=""></option>
<option value="371014006801" selected="selected">371014006801-passenger car</option>
<option value="371014006802">371014006802-two/three wheeled motorcycle</option></select> </span></td>
</tr>
<tr>
<TD class="title_1" align="right">licence number - city:</td>
<TD class="item_1"><input type="text" name="cph_cs" value="豫" style="width:98%" class="bg0"></TD>
<TD class="title_1" align="right">licence number - letter:</td>
<td class="item_1"><input type="text" name="cph_zm" maxlength="5" tabindex="6" value="C" style="width:98%" class="bg0"></td>
</tr>
<tr>
<TD class="title_1" align="right">licence number - number:</td>
<TD class="item_1"><input type="text" name="cph_sz" maxlength="5" value="8****" style="width:98%" class="bg0"></TD>
<TD class="title_1" align="right">type of vehicle (own definition):</td>
<td class="item_1"><span
onmouseover="setEnabled('cllx1',1)"
onmouseout="setEnabled('cllx1',2)"> <select name="cllx1" style="width:100%"><option value=""></option>
<option value="2">2-two/three electric wheeler</option>
<option value="3" selected="selected">3-passenger car(≤9 seats)</option>
<option value="4">4-HGV</option>
<option value="5">5-bus(>9 seats)</option>
<option value="6">6-military vehicle</option></select> </span>
</td>
</tr>
</table>
I tried the following code:
Sub data()
    x = 0
    Set objShell = CreateObject("Shell.Application")
    For Each w In objShell.Windows
        If w.Name = "Internet Explorer" And w.LocationName = "vehicle data" Then
            Set externalIE = objShell.Windows(x)
            Set externalIEDoc = externalIE.Document
            Set elemCollection = externalIE.Document.getElementsByTagName("TABLE")
            MsgBox (elemCollection.Length)
            For t = 0 To (elemCollection.Length - 1)
                For r = 0 To (elemCollection(t).Rows.Length - 1)
                    For c = 0 To (elemCollection(t).Rows(r).Cells.Length - 1)
                        ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1) = elemCollection(t).Rows(r).Cells(c).innerText
                    Next c
                Next r
            Next t
            MsgBox (w.LocationName)
            externalIEDoc.getelementsbyname("btnDown")(0).Click
        End If
        x = x + 1
    Next
End Sub
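But I am getting garbage data. The values I need are case number: 3710140068 and participant number: 371014006801-passenger car. I need to extract data from both the text inputs and the selected options. Please help.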
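Do you need more than what you get from the following? (Note: here I am reading your HTML from a cell into an HTML document; you would obtain the document in the normal way.) It is obviously not very robust, but I don't think you can simply loop over the rows, then over the columns within them, and read HTMLCell.innerText: the values you want sit in the value attribute of the input elements and in the selected option elements, not in the cell text.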
① Early binding (requires a reference to the Microsoft HTML Object Library):
Code:
Option Explicit
Public Sub GetTableInfo()
    Dim html As HTMLDocument
    Set html = New HTMLDocument
    html.body.innerHTML = [A1].Text '<== You would obtain this in the normal way. I just read your HTML in from the sheet.
    Debug.Print "Case number: " & html.querySelectorAll("table tr td[class=""item_1""] input")(0).Value
    Debug.Print "licence number - city: " & html.querySelectorAll("table tr td[class=""item_1""] input")(1).Value
    Debug.Print "licence number - letter: " & html.querySelectorAll("table tr td[class=""item_1""] input")(2).Value
    Debug.Print "licence number - number: " & html.querySelectorAll("table tr td[class=""item_1""] input")(3).Value
    Debug.Print "participant number: " & html.querySelectorAll("table tr td[class=""item_1""] option[selected=""selected""]")(0).innerText
    Debug.Print "type of vehicle (own definition): " & html.querySelectorAll("table tr td[class=""item_1""] option[selected=""selected""]")(1).innerText
End Sub
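The [A1].Text line above is only a stand-in for however you normally obtain the document. As a minimal sketch, assuming the page is already open in Internet Explorer with the title "vehicle data" (the title used in the question), the document could be fetched like this. GetOpenIEDocument and ShowTableCount are hypothetical names introduced for illustration, not part of any library.

```vba
Option Explicit

' Hypothetical helper: returns the Document of the first open Internet Explorer
' window whose title equals pageTitle, or Nothing if no such window is found.
Public Function GetOpenIEDocument(ByVal pageTitle As String) As Object
    Dim w As Object
    For Each w In CreateObject("Shell.Application").Windows
        ' Nested Ifs because VBA's And does not short-circuit
        If Not w Is Nothing Then
            If w.Name = "Internet Explorer" Then
                If w.LocationName = pageTitle Then
                    Set GetOpenIEDocument = w.Document
                    Exit Function
                End If
            End If
        End If
    Next w
End Function

' Small check that the window was found and that the page contains tables.
Public Sub ShowTableCount()
    Dim html As Object
    Set html = GetOpenIEDocument("vehicle data")   ' assumption: title taken from the question
    If html Is Nothing Then
        MsgBox "No matching Internet Explorer window is open."
    Else
        MsgBox html.getElementsByTagName("table").Length & " table(s) found"
    End If
End Sub
```

In GetTableInfo, a call such as Set html = GetOpenIEDocument("vehicle data") would then take the place of the two lines that create the HTMLDocument and fill it from the sheet. Note that querySelectorAll is only available when the page renders in an IE8 or later document mode, so on older pages the getElementsByTagName approach may be the safer choice.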
② Late-bound version for the OP:
Without seeing the wider context of the HTML, here is a somewhat fragile piece of code:
Option Explicit
Public Sub GetTableInfo()
    Dim html As Object
    Set html = CreateObject("htmlfile")
    html.body.innerHTML = [A1].Text '<== You would obtain this in the normal way. I just read your HTML in from the sheet.
    ' The four text inputs of the first table, in document order
    Dim b As Object
    Set b = html.getElementsByTagName("table")(0).getElementsByTagName("input")
    Debug.Print "Case number = " & b(0).Value
    Debug.Print "licence number - city: = " & b(1).Value
    Debug.Print "licence number - letter = " & b(2).Value
    Debug.Print "licence number - number = " & b(3).Value
    Dim c As Object
    Set c = html.getElementsByTagName("table")(0).getElementsByTagName("option")
    Dim i As Long, n As Long
    For i = 0 To c.Length - 1
        ' Fragile: relies on the parser writing the selected option's tag as "selected value=..."
        If InStr(c(i).outerHTML, "selected value=") > 0 Then
            n = n + 1
            If n = 1 Then
                Debug.Print "participant number: " & c(i).innerText
            ElseIf n = 2 Then
                Debug.Print "type of vehicle (own definition) " & c(i).innerText
            End If
        End If
    Next i
End Sub
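Output:
This is the same as with .querySelectorAll.
Unfortunately, 豫 comes out as ?. This is probably a display limitation of the Immediate window rather than a problem with the extracted value.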
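The InStr(c(i).outerHTML, "selected value=") test above depends on how the parser happens to write out the option tag. A sketch of an alternative, assuming the option elements expose the DOM Selected property in your document mode, is to ask each option directly whether it is selected. PrintSelectedOptions and DemoSelectedOptions are hypothetical names, and the HTML is again assumed to be in cell A1 as in the answer.

```vba
Option Explicit

' Hypothetical helper: for every <select> in the document, prints its name
' attribute and the text of the option that is currently selected.
Public Sub PrintSelectedOptions(ByVal html As Object)
    Dim selects As Object, opts As Object
    Dim i As Long, j As Long
    Set selects = html.getElementsByTagName("select")
    For i = 0 To selects.Length - 1
        Set opts = selects(i).getElementsByTagName("option")
        For j = 0 To opts.Length - 1
            If opts(j).Selected Then
                Debug.Print selects(i).Name & ": " & opts(j).innerText
                Exit For   ' these are single-choice selects, so stop at the first hit
            End If
        Next j
    Next i
End Sub

' Demo caller, late bound like the version above.
Public Sub DemoSelectedOptions()
    Dim html As Object
    Set html = CreateObject("htmlfile")
    html.body.innerHTML = [A1].Text   ' assumption: the table HTML sits in cell A1
    PrintSelectedOptions html
End Sub
```

Because the loop is keyed on each select element rather than on a running counter, it does not need to know in advance how many drop-downs the table contains. The labels printed are the select names from the HTML (cyfbh, cllx1), not the caption text of the neighbouring cell.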
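③ Parsing the HTML:
You can also parse the HTML as a string. It would need tailoring to your actual, full HTML, but with the snippet provided you can do it as follows: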
Public Sub ParseBySplit()
    Dim arr() As String
    ' Each element after arr(0) starts just after one class="item_1" cell opener
    arr = Split([A1].Text, "class=""item_1""")
    Debug.Print "Case number = " & Split(Split(arr(1), "value=")(1), Chr(32))(0)
    Debug.Print "participant number: " & Split(Split(Split(arr(2), "value=")(2), "selected=""selected"">")(1), "<")(0)
    Debug.Print "licence number - city: = " & Split(Split(arr(3), "value=")(1), Chr(32))(0)
    Debug.Print "licence number - letter = " & Split(Split(arr(4), "value=")(1), Chr(32))(0)
    Debug.Print "licence number - number = " & Split(Split(arr(5), "value=")(1), Chr(32))(0)
    Debug.Print "type of vehicle (own definition) " & Split(Split(arr(6), "selected=""selected"">")(1), "<")(0)
End Sub
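One detail of the Split chain above: cutting the input values at the first space leaves the surrounding double quotes in the result. A small sketch of a helper that takes the text between two markers instead is shown here. TextBetween and ParseWithMarkers are hypothetical names, and the markers assume the attribute layout of the snippet in the question.

```vba
Option Explicit

' Hypothetical helper: returns the text between the first occurrence of
' startMarker and the next occurrence of endMarker after it, or "" if
' either marker is missing.
Private Function TextBetween(ByVal source As String, _
                             ByVal startMarker As String, _
                             ByVal endMarker As String) As String
    Dim startPos As Long, endPos As Long
    startPos = InStr(1, source, startMarker, vbBinaryCompare)
    If startPos = 0 Then Exit Function
    startPos = startPos + Len(startMarker)
    endPos = InStr(startPos, source, endMarker, vbBinaryCompare)
    If endPos = 0 Then Exit Function
    TextBetween = Mid$(source, startPos, endPos - startPos)
End Function

Public Sub ParseWithMarkers()
    Dim arr() As String
    arr = Split([A1].Text, "class=""item_1""")   ' assumption: the table HTML sits in cell A1
    ' Input cell: take what lies between value=" and the closing quote
    Debug.Print "Case number = " & TextBetween(arr(1), "value=""", """")
    ' Select cell: take the text that follows the selected option's opening tag
    Debug.Print "participant number: " & TextBetween(arr(2), "selected=""selected"">", "<")
End Sub
```

This stays as brittle as any string-based parsing, since a change in attribute order or quoting on the real page would break the markers, so the DOM-based versions are usually the better starting point.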