Topic: Web scraping

posts 1–11 of 11
Page 1
member
3 posts

As someone starting to look at this, can anyone suggest any useful resources, websites etc.? I am already using the Betfair API and would now like to supplement this with data from several other websites (Racing Post, Sportinglife, Sporting Index). I would be very interested to hear from others about languages, tools and any hints & tips you might care to offer.

member
30 posts

As far as tools go, this is one of the best:
http://www.fiddler2.com/fiddler2/

member
30 posts

Also, in what language have you been doing your API development?

member
3 posts

Thanks Nadat

Fiddler2 looks promising and I will download and try it out this evening. My API development has been in Excel & VBA to date, but this isn't meeting my needs any more. I am in the early stages of migrating to PHP & MySQL. Initial signs are that this platform will provide sufficient flexibility for all I might want to do.

member
30 posts

MySQL was overkill for me. You might want to have a look at SQLite [http://www.sqlite.org/]. Afraid I don't know much about scraping with PHP, but Fred might be able to help you there.

member
90 posts

I don't do much scraping at the moment, but I just typed "scraping with php" into Google. A couple of years ago such a search didn't bring up much of use, but now it brings up everything you need, much more help than I ever could be :)

member
25 posts

Depends what OS you're using (and Excel/VBA seems to imply Windows), but on Linux, MySQL is a sensible choice - dead easy to set up, and its ubiquity means there is loads of information, help and advice available for free online.
Two years ago I would have said it was the only sensible choice, but lately I have been hearing a lot of good things about PostgreSQL, which seems to be undergoing something of a renaissance. This is a VERY powerful, enterprise-level database - if I was starting from scratch today I would seriously consider it, and even with 25 GB of existing MySQL tables, I am thinking of migrating.
I use Perl rather than PHP, and the main database library, DBI, is designed to be as agnostic as possible regarding back-end databases. This means you can swap your back-end with minimal code changes (providing the schema stays the same). Not sure, but I think PHP has a similar idea.
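
To illustrate the back-end-agnostic idea, here is a minimal sketch in Python rather than Perl or PHP. Python's DB-API 2.0 plays a role similar to DBI: code written against a connection object can, in principle, be pointed at a different database. The table, column and function names below are made up for the example, and SQLite is used only because it ships with Python. One caveat is that placeholder syntax differs between drivers, so a swap is rarely completely free.

```python
import sqlite3


def save_prices(conn, rows):
    """Store (runner, price) pairs using only generic DB-API 2.0 calls."""
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS prices (runner TEXT, price REAL)")
    # '?' is the placeholder style of sqlite3; some other drivers expect '%s'.
    cur.executemany("INSERT INTO prices (runner, price) VALUES (?, ?)", rows)
    conn.commit()


def best_price(conn, runner):
    """Return the highest stored price for one runner."""
    cur = conn.cursor()
    cur.execute("SELECT MAX(price) FROM prices WHERE runner = ?", (runner,))
    return cur.fetchone()[0]


# Only this line names the back-end; the functions above never mention SQLite.
conn = sqlite3.connect(":memory:")
save_prices(conn, [("Runner A", 3.5), ("Runner A", 3.75), ("Runner B", 8.0)])
print(best_price(conn, "Runner A"))
```

The two functions only see a connection object, so moving to another database would mostly mean changing the connect call and the placeholder style, assuming the schema stays the same.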

member
30 posts

Does it really depend upon your chosen OS? MySQL and SQLite are both open source, cross platform, documented and well tested. However, one is a client-server DBMS and the other is a 500 KB library. I'd suggest it depends upon your application's requirements.
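
As a rough illustration of the "library rather than server" point, the sketch below uses Python's built-in sqlite3 module. The file name, table and data are hypothetical. There is nothing to install or start: the whole database is a single file that the program opens directly.

```python
import os
import sqlite3
import tempfile

# Hypothetical file name; the entire database lives in this one file.
db_path = os.path.join(tempfile.mkdtemp(), "racing.db")

conn = sqlite3.connect(db_path)  # creates the file if it does not exist
conn.execute("CREATE TABLE results (race TEXT, winner TEXT)")
conn.execute("INSERT INTO results VALUES (?, ?)", ("Example Stakes", "Runner A"))
conn.commit()
conn.close()

# Reopening the file shows the data persisted with no server process involved.
conn = sqlite3.connect(db_path)
rows = conn.execute("SELECT race, winner FROM results").fetchall()
conn.close()
print(rows)
```

A client-server DBMS such as MySQL would instead need a running server, a user account and a network or socket connection before the first query, which is worth it only if the application needs those features.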

member
11 posts

Fiddler or a similar HTTP debugger is a must when dealing with any reverse engineering. Basically you have got two possibilities when parsing HTML pages: simple text parsing (using regex, or plain text searching and extracting) or DOM browsing. I prefer the last one. As most websites have at last entered the Web 2.0 age, you should also have a look at JSON and XML; the Betfair website uses JSON. The programming language is your choice, but in my opinion you should not waste your time with languages whose purpose lies elsewhere.
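
The approaches mentioned above can be sketched side by side. This is a minimal Python example on a made-up HTML fragment and a made-up JSON payload; the class names and field names are assumptions, not taken from any real site. For the parser-based approach it uses the standard library's event-driven HTMLParser as a stand-in, since Python does not ship a full HTML DOM.

```python
import json
import re
from html.parser import HTMLParser

# Made-up fragment standing in for a downloaded page.
PAGE = (
    '<table><tr><td class="runner">Runner A</td><td class="price">3.5</td></tr>'
    '<tr><td class="runner">Runner B</td><td class="price">8.0</td></tr></table>'
)

# 1. Regex: quick to write, but breaks as soon as the markup changes.
regex_prices = [float(p) for p in re.findall(r'<td class="price">([^<]+)</td>', PAGE)]

# 2. Plain text search and slicing: find a marker, cut out what follows it.
marker = '<td class="runner">'
start = PAGE.find(marker) + len(marker)
end = PAGE.find("</td>", start)
first_runner = PAGE[start:end]


# 3. Parser-based: let a real HTML parser track the tags for you.
class CellCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.current = None  # class attribute of the <td> being read, if any
        self.cells = []      # collected (class, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.current = dict(attrs).get("class")

    def handle_data(self, data):
        if self.current:
            self.cells.append((self.current, data))

    def handle_endtag(self, tag):
        if tag == "td":
            self.current = None


collector = CellCollector()
collector.feed(PAGE)
runners = [text for cls, text in collector.cells if cls == "runner"]

# 4. JSON: when a site serves JSON, no HTML parsing is needed at all.
payload = '{"runners": [{"name": "Runner A", "price": 3.5}]}'
json_price = json.loads(payload)["runners"][0]["price"]

print(regex_prices, first_runner, runners, json_price)
```

An HTTP debugger such as Fiddler is how you would find out whether a page loads its data as JSON in the first place, in which case the fourth approach is usually the least fragile.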

member
25 posts

I think Nadat hit the nail on the head - all these choices depend on your application. Work out what you want to do first and pick the tools to suit.

member
124 posts
How about VB6? It saves the hassle of learning a new language, and I've written some nice bots in VB6 as it is/was my main language.