Channel: VBForums
Viewing all articles
Browse latest Browse all 42231

[RESOLVED] Extract text from page source

I am trying to extract some links from a webpage. Here is my code, but for some reason it's not getting the text.


Code:

    Dim survpage As String

        Dim webclient As New Net.WebClient
        Dim survpage As String = webclient.DownloadString("http://www.yahoo.com/")

    End Sub

    Private Sub Button2_Click(sender As System.Object, e As System.EventArgs) Handles Button2.Click
        For Each line As String In GBA(survpage, "finance.yahoo.com/blogs/", "html")
            ListBox1.Items.Add(line)
        Next
    End Sub

    Public Function GBA(ByRef strSource As String, ByRef strStart As String, ByRef strEnd As String, _
Optional ByRef startPos As Integer = 0) As List(Of String)
        Dim iPos As Integer, iEnd As Integer, strResult As String, lenStart As Integer = strStart.Length
        Dim L As New List(Of String)
        Do Until iPos = -1
            strResult = String.Empty
            iPos = strSource.IndexOf(strStart, startPos)
            iEnd = strSource.IndexOf(strEnd, iPos + lenStart)
            If iPos <> -1 AndAlso iEnd <> -1 Then
                strResult = strSource.Substring(iPos + lenStart, iEnd - (iPos + lenStart))
                L.Add(strResult)
                startPos = iPos + lenStart
            End If
        Loop
        Return L
    End Function

I am trying to capture all the news links that contain the word "finance", and then add the start string back to each result when I navigate to it.

I am getting the error "Object reference not set to an instance of an object" at this line: iPos = strSource.IndexOf(strStart, startPos)
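
One likely cause, judging only from the code as posted: survpage is declared twice. If the first declaration is at form level, the second "Dim survpage As String = ..." inside the download routine creates a separate local variable, so the form-level survpage is still Nothing when Button2_Click passes it to GBA, and strSource.IndexOf throws. Below is a minimal sketch of that fix, written as a console module rather than a form so it runs without a designer file or a network call. The names ExtractDemo and LoadPage and the sample HTML are made up for illustration. GBA keeps its original name but takes its arguments by value, returns an empty list for a missing source, and stops when the end marker is not found, since the posted loop would otherwise never exit in that case (iPos never becomes -1).

```vbnet
Imports System
Imports System.Collections.Generic

Module ExtractDemo
    ' Module-level field, standing in for the form-level survpage.
    Private survpage As String

    ' Hypothetical stand-in for the handler that downloads the page.
    Sub LoadPage(html As String)
        ' Assign to the existing field. Writing "Dim survpage As String = html"
        ' here would declare a new local and leave the field as Nothing.
        survpage = html
    End Sub

    ' Returns every piece of text found between startText and endText.
    Function GBA(source As String, startText As String, endText As String, _
                 Optional startPos As Integer = 0) As List(Of String)
        Dim results As New List(Of String)

        ' Guard against the page not having been downloaded yet.
        If String.IsNullOrEmpty(source) Then Return results

        Dim pos As Integer = source.IndexOf(startText, startPos, StringComparison.Ordinal)
        Do While pos <> -1
            Dim contentStart As Integer = pos + startText.Length
            Dim endPos As Integer = source.IndexOf(endText, contentStart, StringComparison.Ordinal)

            ' No closing marker left: stop instead of looping forever.
            If endPos = -1 Then Exit Do

            results.Add(source.Substring(contentStart, endPos - contentStart))
            pos = source.IndexOf(startText, contentStart, StringComparison.Ordinal)
        Loop

        Return results
    End Function

    Sub Main()
        ' Made-up sample markup, used instead of a live download.
        LoadPage("<a href=""http://finance.yahoo.com/blogs/a.html"">A</a> " & _
                 "<a href=""http://finance.yahoo.com/blogs/b.html"">B</a>")

        Const prefix As String = "finance.yahoo.com/blogs/"
        For Each fragment As String In GBA(survpage, prefix, "html")
            ' Add the start and end strings back to rebuild a navigable link.
            Console.WriteLine("http://" & prefix & fragment & "html")
        Next
    End Sub
End Module
```

In the form itself, the equivalent change would be to write survpage = webclient.DownloadString("http://www.yahoo.com/") without the Dim, and to make sure the download has run before Button2 is clicked.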
