Jonathan Leger – SEO And Internet Marketing Blog

10 Dec 08

Preview my new web data parser!

I'm always needing to extract data from web pages that aren't set up to export the data: keyword data, link data, traffic data, you name it.

What I've always done in the past was write a script or a quick application that was designed specifically to extract data from a particular page or set of pages. If that sounds like a lot of work -- it is, though the value of that data made it worthwhile.

I finally got smart and said, Hey! I'm doing this all the time. Why not create a tool that can extract data from just about any table from any web page?

Thus Web Data Parser was born. This powerful tool lets you extract the data from any table on any web page and save it in a useful format (CSV for spreadsheets, TSV for databases, or an HTML table). It will also extract the links and link text from ANY section of any page you choose.
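
To give a rough idea of what table extraction involves, here is a minimal sketch in Python using only the standard library. It is not how Web Data Parser itself is built; the `TableParser` class, the `table_to_text` helper and the sample keyword table are all made up for illustration, and the sketch assumes simple tables with no nesting.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collects the cell text of every <table> on a page, row by row.

    Hypothetical helper for illustration; nested tables are not handled.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def table_to_text(rows, delimiter=","):
    """Serialize rows as CSV (comma delimiter) or TSV (tab delimiter)."""
    out = io.StringIO()
    csv.writer(out, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Made-up sample page, standing in for a real keyword table.
SAMPLE_HTML = """
<table>
  <tr><th>Keyword</th><th>Searches</th></tr>
  <tr><td>dog collars</td><td>1,200</td></tr>
  <tr><td>spiked dog collars</td><td>90</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
csv_text = table_to_text(parser.tables[0])
tsv_text = table_to_text(parser.tables[0], delimiter="\t")
print(csv_text)
```

The `csv` module takes care of quoting, so a cell such as `1,200` stays in one column in the CSV output. Real pages are messier than this sample (nested tables, merged cells, scripts), which is where a dedicated tool earns its keep.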

This tool is still in beta and not ready for release yet. I want to get your opinions and ideas on how to make the tool better. I want to find out what functionality and features YOU need in a tool like this.

So click here to watch the preview video.

Then leave your thoughts, comments and suggestions below.

Filed under: #All, Web Design
Comments (151) Trackbacks (0)
  1. Really a good and useful tool with many options that make it easy to use. It will benefit everyone who wants to do design on their own.

  2. That’s a pretty sweet app! And a nice video… I’m impressed

  3. I think this new tool will help me a lot. I’d really like to see it have the capability to export to a .csv file. I’m looking forward to trying it out. Thanks.

  4. You just keep coming out with tool after tool, don’t you? I don’t happen to need this one, but I am seriously considering 3WayLinks, since you’ve had a lot of success with that.

  5. Awesome idea! How much will this product cost? And will it allow me to export the data into an excel spreadsheet?
    I’m definitely interested in this… Please keep us updated.
    Thanks!

  6. Good idea to extract the content in a table format… and I think it would be better to have an edit option to change the content.

  7. Can it extract information from a YouTube video? Like listen to what they’re saying and type it up for me? Or from a movie that I get from a torrent?

  8. Looks like a great tool John, two things I’d like to see.

    1. The ability to leave out individual columns of data e.g. nos.
    2. Extract site wide webpage titles and descriptions which would be useful when checking for duplicates etc.

  9. Handy tool. I might pay up to $9.95 but not more as a stand-alone PC based tool.

    I would personally value and use the tool much more if it would execute under Linux with an API (or source code license) allowing me to embed the functionality in my own PHP scripts. I could then chain multi-page GETs and add a MySQL database backend. Nirvana is a scraper’s tool kit!

    Proof of functionality will be in the pudding – e.g. will it extract css tables? So, I hope you will use real beta testers before release.

    A useful option for international users would be to toggle decimal point representation between period (USA) or comma (Europe).

    Needs to output directly into Excel (.xls output format) to save time-consuming intermediate steps via .csv or .txt.

    Major extension (worth extra money) would be PDF capability.

    Steve1943

  10. Jonathan,
    It truly looks like a valuable tool. It would be nice if the tool could extract keywords/keyword density from the page and site description. I have dealt with putting together the largest directory of English sites in Japan, but it is extremely difficult to build a database because Japanese sites often don’t worry about being SE friendly, especially not in the English side of their site.

  11. Jon, this software will help me to extract list of domains from the registrars site, where i can sort them, translate them, etc..
    It’s very powerful when you have hundreds of domains. Pretty amazing tool that I personally need very much.
    When will you release this tool for purchase?

    Jenna

  12. Jonathan,
    I am just getting started on the internet and I can still kick myself for not getting “Instant Article Wizard Pro 2.0” when you first put it out last March/April 2008. Because of that experience, I did, in fact, recently purchase WebCompAnalyst, even though I am not ready to use it. After seeing the quality of your work, I knew that it would be a good deal and that I could put it in my tool chest until I was ready to learn and use it. In reference to this “Extract Data” software, I will defer making any comment for the same reason, but hope you will keep me posted when it is ready. I will probably think up some uses for it before that time and if I do, I will let you know.
    Jay

  13. Hi Jon!

    This seems to be quite a good tool. It would be great if it could export to .xls format or integrate with Microsoft Access for easy data manipulation.

    Cheers!

  14. This is great. I’ve been looking for something like this for ages. I can’t count the number of times I’ve wanted a definitive list of something and ended up at a Wiki page – then copied and pasted the tables there.

    Then I’d be spending the next three hours trying to untie the copied stuff into a list that made sense and looked readable! A tool like this will save all that. Because I use data feeds and data manipulation a lot I estimate it will save me at least 10 hours every month.

    A very timely product indeed! Nice one, Jon.

  15. Since video is the new web 2.0 I would like to be able to
    extract a web video link even from a youtube source.

  16. Jon,
    I’d like to have the ability to remove columns or rows from the selected table, or possibly re-arrange the columns in the exported data.

    Looking forward to it!

  17. I’ve searched for many years for a tool like this, and we use two of them daily to grab arrival and departure timetables from airport websites. Two features are in high demand: script execution and XML data format. Is there a beta-test or demo version ready?
    Thanks
    Mimmo
    Turin (Italy)

  18. great tool,

    would appreciate if your tool can extract the prices of stock for high price and low price for each time frame eg. from 1 mins to 60 mins and daily to monthly over a 10 year period would be good.

    Not only the prices but the time it took to reach that high or low in a given time frame.

    cheers,

    cynthia

  19. Hi Jon

    Interesting tool. One place I might use this tool is in Google, when I would be searching for possible sites to do a link exchange with.

    I could save a table of 100 sites that get returned for a search term, with their links, and use the csv to keep track of the sites I found in my search, that I have sent emails to, and have gotten replies and have reciprocal links with.

    Then I could do a search another day and append to my list, remove any duplicate sites (unless your software can do that, only append when not a duplicate), and that way I can keep track of hundreds of sites and which I contact.

  20. I keep track of property prices in my area, and on a typical search there are maybe 10 pages of listings. I copy and paste the text to notepad, then use a very complicated excel sheet that I’ve developed over time to extract the listings into a single spreadsheet.

    If I had a tool to do that in one go I’d jump at it.

  21. Jonathan,

    That looks very useful. You sure hit the nail right on the head for Wordtracker.

    My challenge will be for this to have some form of campaign/project management integrated.
    I envisage that the files stored might be kept in the program and that in future, when we see other relevant data, we would be able to select the appropriate file from within the program (e.g. a drop-down) and the data would automatically be appended to the appropriate file.

    I can see this might become tricky if we were ever to move the file on our computer, but it might be that your program has a specified data folder for all files to solve that.

    I’d like it to be “always available” in Firefox but not consume too much resource while sitting there. (ie. Not have to open it specifically every time I wanted to use it) like a Plugin works.

    I’m looking forward to having one…

    Cheers
    Kerry

  22. I have been trying to find a way to collect only new posts that offer a free item. At the moment I am creating a freebie list from Google Reader.

    I would need their URL and also an image. I am not sure if your software can do this, but it is certainly worth asking.

    If there is a software out there that does this, I would really appreciate anyone getting in touch with me.

  23. This is really smart idea. funny i hadn’t thought of it before. ranks up there with ketchup.

  24. Hey Jon,

    Another useful tool!

    I have problems when searching through lists of expired domains, to extract just the domain name without the dates, prefixes, etc.

    Might want to add a feature to extract by column.

    Kind regards,
    Dave Jackson
    Naples, FL

  25. Hi john,

    … am a webbot developer myself. Your tool looks very good and there certainly is a market for it. :-)

    Keep Going

    All the best
    th

    P.S. If you need a beta tester … for testing the system on different machines, just sent me an email.

  26. What about a program that rewrites articles? I would find that useful. Any plans to make that software?

  27. Awesome tool, Jonathan. You’re an awesome programmer and the tool looks very useful… Do you just copy the stuff on a page and upload those words to your site?

  28. Great tool Jonathan. Can you add the functionality to add to a separate tab in Excel when appending data?
    Keep up the great work.

  29. Interesting applications.
    Two things come to mind. First being able to extract a portion of a table.
    Second, and what would really interest me, is to be able to extract data from a directory sort of website, for instance the yellow pages. To be able to assign values, such as an email address or name, to each column. This may be beyond your intent of working with tables, but would sure enhance the value.

    Thanks,
    Corky

  30. I would find this extremely useful. I have to try and extract prices/offers/dates etc. from the websites of my suppliers of flights and cruises to ensure that the offers we display on our site are up to date.

    Although we have found a way around doing this it is a very time-consuming process.

    This would make life so much easier.

    Colin

  31. I often need to extract protected images (those having a blank layer on top) and videos. Would it be possible to do that with this tool?

  32. Hi,

    Nice tool, but it’s a slightly adjusted version of your backlink analysis tool; I don’t remember its name. But I think it’s worth having in a marketer’s collection.

    Cheetu

  33. I use Micro Niche Finder for keywords. Generally, if I want to save any information on a webpage, I’ll do like one of the guys here said, copy and paste into Excel. For commas I will use SUM=A2&B2 with a comma in B2, and the keyword in A2.

    It isn’t as useful for forms as demonstrated though.

    I think it’s a great tool, but Id like to think that it’s another expense when perhaps if you are organised enough you’ll be saving money (sadly not time :D )

  34. This sounds good for the many (thousands) of our subscribers who are photographers. Although they take pictures across the board, many of them gravitate to niche areas.
    Finding, updating, and processing those (often) hidden niche areas would be of benefit to them, if they had a tool to do this.
    Put me down as a beta tester, Jon.
    Thanks

  35. Hi Jon,

    Once again you’ve come up with something that makes me wonder why no-one ever thought of doing it before.

    A very nifty and handy tool to have on the desktop for keyword analysis. As someone has already mentioned, extracting images would be good.

    Keep up the good work.

  36. I like the idea of being able to extract all the data from Google’s External keyword tool. search volume, CPC, trends etc.

    I see that I am not the only one so maybe you will try to include this feature in your tool.

    Thanks

  37. I would like to have a little more formatting. In the past when I have used scrapers I have always wanted to remove characters at the beginning of a line or at the end of a line and sometimes in between, so the ability to remove/replace a character string with a character string would be useful.

    Additionally, these options would be cool:

    1. The ability to remove blank lines.
    2. Convert tabs to spaces and vice versa.
    3. Remove leading spaces and trailing spaces.
    4. Copy all or part to the clipboard.

    There are others but less important. Hope this helps,

    Dave MacGregor
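
For anyone who wants these cleanup steps before the tool ships, the first three options plus the string replace can be sketched in a few lines of Python. This is only an illustration under assumptions: `clean_text` and its parameters are hypothetical names, the sample text is made up, tabs are converted to spaces only (not the reverse), and the clipboard option is left out because it depends on the platform.

```python
def clean_text(text, find=None, replace_with="", tab_width=4):
    """Apply simple cleanup steps to a block of scraped text (hypothetical helper)."""
    if find:
        # Replace one character string with another.
        text = text.replace(find, replace_with)
    # Convert tabs to spaces.
    text = text.expandtabs(tab_width)
    # Remove leading and trailing spaces on every line.
    lines = [line.strip() for line in text.splitlines()]
    # Remove blank lines.
    lines = [line for line in lines if line]
    return "\n".join(lines)


# Made-up scraped text with bullets, tabs, stray spaces and a blank line.
raw = "  * dog collars\t10 \n\n\t* spiked dog collars  \n"
cleaned = clean_text(raw, find="* ", replace_with="")
print(cleaned)
```

The order matters a little: replacing first and trimming last means any spaces left behind by the replace are cleaned up too.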

  38. Great tool Jonathan, I can use this in many ways. I would love to be able to remove columns from a table, particularly those columns that contain non-alphanumeric items such as tick-boxes which are a real pain sometimes when importing into Excel.

  39. Another great software application Jonathan!!!

    WebCompAnalyst has changed my thinking on SEM. You have probably covered this but I would like to grab ‘price comparison tables’ for any specific product and use it on a ‘call to action page’.

    Thanks again, John

  40. I need your program asap so I can easily import data from other websites in excel format so I can upload the data into a database web app called dabbledb.com.

    How soon are you going to release the program?

  41. Jon, this is a real time saver for niche site builders. The suggestion of dragging the data, like “Ulitily Poster”, to an HTML editor. The above posts from others have covered a lot of neat options as well.
    Good Job Again.
    Thanks,
    Wallace

  42. Jonathan,

    The Link extract seems to be most useful to me. If that feature could be extended to not just include tables – but entire web pages – it would be a terrific tool to quickly gather all the links on a page to look for bad ones, or those that need to be updated.. in scenarios where you need to change your links due to an affiliate change of some type.

    Even better bet it could manage all the links on all your pages making it easy to update, fix dead links, remove on affiliate program cancellations etcetera. Link inventory is a tedious task. For example I get an email saying affiliate x is now owned by y and i need to get all my links updated to y’s flavor of them. If I’m a big user of x’s links all over my site, it’s a real hunt and peck operation. This product might be a great first step to making that much easier. Also check the links pages for reciprocating links, the list goes on and on.

  43. How about something to sort the products from an ebay seller based on total amount sold. That way you can know what is a good product to sell and for how much in a certain category

  44. This looks pretty good. I have been thinking about something similar myself for similar reasons. If you are releasing this then I won’t have to bother :)

    One suggestion: have it watching the table for changes?

    For example in your video you go to a stocks site and grab the table, have your program watch this page so that it can update or append the data you extract yourself and monitor it. Let’s say for example the top sellers list at amazon or google trends data etc…

    Sorry if that doesn’t make sense if not ask and i’ll clear it up.

    Thanks
    Phil

  45. Hi Jon

    Would be cool if you could extract by highlighting the items that we want to extract and export them to a table. Let’s say view a page’s source code and then get the meta keywords from a site that we see as our competitor and put them in random order for keyword testing. This could be an invaluable addition.

  46. Hi Jon
    I’d like to know if the data can be re-inserted into a table structure.

    It would be very helpful to me for quoting other webmasters data tables in my own blog. This is mostly property sales and rental figures for different states which I post in one of my blogs.

    Great piece of software. Most innovative thinking!

    Top marks!

    Gerry DR

  47. I would like to see the ability to extract contact us data from a website, information which might include any physical addresses, phone numbers or email addresses present. Also then being able to query whois and pull back the admin, technical & billing information.

  48. Would like to be able to extract a table of data from sites like Hotels.com or Yellowpages.com, or other similar sites. Not sure these are even presented in a table on the webpage, but extracting data like that would be very useful.

  49. Interesting tool. How about combining this with gistweb for text mining as well :)

  50. Another great tool that can save us all a lot of time.

  51. Another great tool by Jon, extracting data has never been easy for me,

    Unique software, good stuff

    Jason

  53. Nice tool design Jon – and it looks like it works very fast.

    I would like the ability to extract data on a web page with email addresses and names, like the listing in Clickbank of the customers that have bought from you or through your affiliate link. This would allow future follow-up with them and to send bonuses for buying, etc.

    I look forward to seeing the end result of your labours :-)

    Barry

  54. Hi Jonathan,

    A great looking tool. I would also like to be able to use a tool of this nature for content research. In other words I would like to be able to extract text (a paragraph, a run of paragraphs, a list etc.) from web pages which could be put into a single text file as research perhaps for website content or an article. It would also be useful if I had the ability (optional) to also save a link to the page from which the information was extracted.

  55. Great tool Jon,

    I would like to be able to extract data from affiliate networks to populate ecommerce pages. This data would include the product plus commentary by the network to make a multi-page website. It would also keep the links so that a buyer clicking on the item would go back to the vendor site and my affiliate code would also be maintained for sales credits.

  56. Hi Jon,

    It would be great if your program could collect all the headlines from a sales letter (all h1, h2 and h3) and also all bullet points. This would be a very easy way of retrieving the most important data from a sales page.

    Thanks. Very interesting software

  57. Am particularly impressed with the link extraction! I could pretty much do the other extractions (Wordtracker, Clear-Station, etc) onto a spreadsheet with some quick formatting, but the link extraction’s definitely a different offering completely.

    Yup, would be good to be able to extract pdfs as well.

    Other suggestions would be more powerful/flexible formatting options: append, concatenate, etc.

  58. Keyword work. After building a huge keyword list from several sources, if your tool had the ability to scan all the keywords and remove all double ups that would be a great time saver.

  59. Hi
    Another great tool being tweaked. I would also like to see if the links are affiliate ones from some of the major sites. I also like the idea of revisiting the site at a later time to see what has changed or been added, with the option to just select individual pages. Another idea, if it is possible, would be to determine if there is a particular theme going on. An example would be “this site seems to be 70% abc related”. Looking forward to trying it out.

    Smiles
    Kim

  60. Jon,

    Will it work on DIV tag ?

    and how about this type of website ?

    http://www.turfclub.com.sg/tabid/203/ctl/ResultDetail/mid/824/ItemId/471/r/-1/Default.aspx

    will lead to detail page like :

    http://www.turfclub.com.sg/tabid/207/ctl/details/Mid/1081/Default.aspx?horsename=GIDDA+BOY

    It will be great if it can follow the link like to 1, 2 or 3 levels.

    My current method of doing it is via EXCEL VBA … but different version of EXCEL gives me different problem. That’s my headache.

    Regards,
    David

  61. Jon,

    Kudos again dude!

    This reminds me of some Dapper stuff I have seen as well as some RSS mashup techniques.

    I think you are on to something with this.

    #1 NOW, I think it would be awesome to be able to have several of this active on one page so when the button is pressed it scrapes out the data.

    I was thinking about applying this to Hot Trends marketing. Find the product from say Amazon. Block out your needed info and presto you have the data. I am still doing the copying and pasting and stripping out the odd characters.

    #2. For tables it would be great to turn off either rows or columns depending on what was needed.

    #3 I guess this is more of a question but How do you process Tables within Tables. Hmmmm

    At any rate. You’ve impressed me again.

    Great job!

    Buddy

  62. Nice tool, how about extracting rss feeds and what about data feeds for affiliate products mmmm !!

  63. Hi Jonathan

    I would love to be able to watch the video, however it stops dead at the introduction sample of wordtracker. …nothing else.
    I would really like to understand what this software is about.

    By the way, unrelated, I have the article wizard which has truly helped learn to write articles. thank you for that tool.

  64. Yes, great tool.

    Yes – PDF extraction
    Yes- Image Extraction
    Could this thing pull audio/video off a page?

    Thanks,

    Dave Turo-Shields

  65. Hey Jon,

    How about extracting all the domains and related domains and also visited domains from alexa and quantcast?

    Then the ability to strip them down to just the domain itself?

    You’re on to something useful.

    John

  66. Not sure I can think of anything else right now (that others haven’t already mentioned) but I will tell you that my first response was “wow”. ….but then my ultimate response was “phenomenal” when I saw that you could just keep adding to the same spreadsheet. That is so valuable! Thank you for that…but the only question about that is does it make new worksheets which would be ideal or are all the data results exported into the same worksheet in the file?

    You are amazing and I’m very appreciative of all you do!

    Terrie

  67. Hey Jonathan,

    This looks interesting. But I can live with out it. There are other tools you’ve made that were must-have’s for me.

    I know this is way off topic, but if you’re looking to build another must-have, grand slam of a tool…most of us would sacrifice a limb or two for a tool like Affiliate Elite.

    …A tool that would allow you to type in a url address and spit out the keywords they’re using in their AdWords campaign.

    You’re the best, Jon

  68. I noticed that you don’t have a way of saving the title of a table. That title is very important to me for later identification of what the data is all about.

    Also, it would be helpful if there was a “Notes” field where we could add comments. Raw data without comments can be of less value.

  69. Wow! What fantastic suggestions. I’ve already got a great extra list of powerful features I will be building into Web Data Parser in the next few days / weeks.

  70. How about grabbing text lists that are not in a table.

  71. WEB DATA PARSER Suggestion:

    How about a feature which allows you to supply a batch of website URLs and allows you to load each web page in turn, extract an image from the page and stores the image alongside the web page URL (so the URL can be used as the image retrieval key).

    Many Clickbank ad generators I use only supply a text description and a link to the product web page, whereas I’d like to display an image of the product as well in my ads. The feature I suggest will allow me to quickly obtain an image to go with the CB product ad.

  72. Hi Jon,

    Thank you for always coming up with very useful softwares like this, if I may suggest something, aside from the URLs, how about being able to use keywords or niches as search bases for data gathering. A feature that would also find forums and blogs based on the typed keywords would also be nice.

    Cheers!

  73. Hey….this looks great. Something I am looking for is for keyword LSI. I want to be able to do a search of each site I am checking out and add the number of times the keywords are used and delete repeats. So if I go to one site and look up dog collars and see that the word spiked dog collars is used 6 times on the page and then go to the next site and find the same word is used 3 times, then I want to be able to only have spiked dog collars to appear once on my list, but I want my count to show 9. Make sense? Looking forward to data parser.

    Thanks,
    Duane
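
The tally described above (each phrase listed once, with its count added up across sites) can be sketched in Python. This is just one way to do it and says nothing about how Web Data Parser will handle it; `count_phrase`, `merge_counts` and the two sample pages are invented for the example.

```python
import re
from collections import Counter


def count_phrase(page_text, phrase):
    """Count case-insensitive, whole-word occurrences of phrase in page_text."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, page_text, flags=re.IGNORECASE))


def merge_counts(pages, phrases):
    """Add up per-page counts so each phrase appears once with its total."""
    totals = Counter()
    for text in pages:
        for phrase in phrases:
            totals[phrase] += count_phrase(text, phrase)
    return totals


# Made-up page text: the phrase appears 6 times on one site and 3 on the next.
page_one = "spiked dog collars " * 6
page_two = "Spiked dog collars. " * 3

totals = merge_counts([page_one, page_two], ["spiked dog collars"])
print(totals["spiked dog collars"])
```

One thing to watch: if the list also contains a shorter phrase such as "dog collars", its count will include the occurrences inside "spiked dog collars", so the totals for overlapping phrases are not independent.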

  74. Wow…. your data parser is something I’ve been wanting for years. I can’t wait to get my hands on it.

    Only feature I would add is the ability to include / exclude columns in a table. Not a biggie because I can do this once I download the data into Excel. But since you asked….

    Thanks,
    Fred Gagnon

  75. Nice tool, Jon!

    How about extending it to pull “meta tag” information from the HTML code?

    Scott

  76. Well John you did it again another really great software that I will buy and use often unlike many other programs I paid for and did not get my moneys worth. I also would like photos. Please let me know when this is ready I will be on line to buy as always
    Thanks Safe & Secure

  77. Great tool, a data mining configuration for extracting data like email addresses, names,etc.,that are not in table format, as mentioned above would be a good enhancement.

    Have your Instant Article and it is very helpful.

  78. Jon I think you’ve got it back-to-front.

    I can always very quickly select a table on a web page and drop it into a spreadsheet. And append that way as well.

    But hey! That last little bit where you can extract links . . . now THAT’s one hot little item!

    I’d market it as a link parser and add the tables data as the free bonus.

    Then again, I’m making a shipload LESS money than you on the web!

    All the very best though mate -- you’re one of only three people I have time for (Bill Myers and Dennis Becker are the other two), and I love the products I got off you -- Instant Article Wizard and Web Comp Analyst

    Go for it son — you’re da bomb!

  79. Jon,

    Great tool!

    Might have missed this but does it have the ability to extract links/ link text in text rather than in just tables?

  80. I think this is a great tool. Many times have I encountered the very problem your tool is designed to solve. Honestly, I can’t think of anyway to improve what you’ve already done. I look forward to its release.

  81. Jon,

    Another great and useful tool! I will have to ditto on some earlier comments that I would like to see something like this for PDF’s

  82. Jon,
    I just spent 4 hours extracting info to place on a spread sheet. Can’t wait for the release. Good job.

  83. Hi Jonathan,

    Very impressive tool; seems like it could be most helpful. The most direct suggestion I can give is could this tool be made to not only extract tabled, listed or grouped data, but could it then save it also in a small variety of most commonly needed upload formats?
    An example is when I use a big list of keywords saved as a CSV. When I try to paste it in, I only get the first word (thus I have to copy the CSV to a notepad, hit “comma” after each word, then “delete”, then “space”, ad nauseam)…
    I hope this question isn’t too stupid for words…

    Thanks in Advance,
    Keli

  84. Extracting from AdSpy Pro and (.pdf)’s are the main areas that come to mind at the moment? Looking good though!

    Harold…

  85. how will you get around getting the ip address that is scraping the data from getting banned by the source url?

    also would be nice if it could do pdf documents

  86. It’s a nice little tool, but it has a lot of limitations in its current form.

    A more useful feature would be able to record and run scripts and capture series of pages.

    Also, I am not too sure if it allows to capture ultra complex pages such as google Adwords pages with anarchic table structures.

    An other suggestion would be to export directly to MSaccess

    Finally, it seems you need to select tables using Firefox. How about IE ?

  87. Funny, I was just trying to extract a list of words from wordtracker and having problems.

    I can’t think of any other uses but I think a low price on this would get my $ since it would not be something I used every day like i have been doing with Wordcomp.

  88. To be able to pull internal and external links from the whole website through multiple levels would be very handy.

    Giving a full list of all links from a complete website, separated into internal and external, if possible with anchor text.

    GREAT idea :-)

  89. I think it is great and like the fact that it isn’t complicated. Lots of good programs are “spoiled” by trying to do too many things at once.

    Would have saved me lots of headaches over the years!

  90. Hi Jon, great idea, thanks.

    One thing that would help is automating the extraction. So, say I save 5 sites with tables that I need in a csv file. I would like to be able click a button and have the Data Parser go into each site, extract the data and save it to the file specified.

    I guess you could add another tab and call it “Profiles” or something.

    In your Clear Station example, you wouldn’t want to go into the site manually everytime you wanted the stock prices. You may as well just use the browser. But if you had an Excel program working off this data, you’d want all your data to be updated with the click of a button.

    So far though it looks awesome! Good work.

  91. Jon, you’re always ahead of the curve, love your tools. I was just last night copying and pasting from WordTracker so can see this being a great time saver in that regard.

    I think Stephan has some excellent points, but programming is not my thing, so don’t know what is involved.

    Curious as to the price; I was lucky on IAWPro, was in on the beta. Glad I’m on your list.

  92. Jonathan – your new tool is great. I do not know how many hours I have spent copying and pasting information off tables.

    It looks pretty good to me right now; I would have to play with it for a little bit to get a better idea on how to improve it.

    Looking forward to it.

  93. Hi Jon

    I personally work more with text than numbers. One way I use Instant Article Wizard is to preview the statements generated, go to the website, copy and paste text to a word processor. Move to the next site, repeat – and so on. After a half dozen or so site visits I generally have enough information to craft a good original article with this as guidance.

    If your new program could extract text and append it, as for numbers, that would save me two steps in my current process – and that equates to a worthwhile economy of time.

    Sometimes, the sites do not allow copying and in that case I use “Snagit” to capture what I need. That adds another step.

    Then PDFs. I can see it as a handy tool in re-writing PLR books and reports – re-ordering sections, etc.

    All in all, this could be my new “best friend” in managing and massaging the written word.

  94. Please add an option to customize “Remove number formatting”: in Italy we use the comma (instead of the point) for decimals.
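A minimal sketch of the configurable number clean-up requested above, assuming the separators are simply passed in as settings. The function name `parse_number` and its parameters are hypothetical and are not taken from Web Data Parser itself.

```python
def parse_number(text, decimal_sep=".", thousands_sep=","):
    """Turn a formatted number string into a float.

    decimal_sep and thousands_sep are hypothetical settings: an Italian
    user would pass decimal_sep="," and thousands_sep=".".
    """
    # Drop the grouping character first, then normalize the decimal mark.
    cleaned = text.strip().replace(thousands_sep, "")
    cleaned = cleaned.replace(decimal_sep, ".")
    return float(cleaned)


us_value = parse_number("1,234.56")
it_value = parse_number("1.234,56", decimal_sep=",", thousands_sep=".")
```

Both calls should yield the same value, which is the point of making the separators configurable rather than hard-coding the US convention.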

  95. Great Stuff Jonathan, I have looked for something like this when sorting keywords on the Google Keyword Tool. It is really going to be a useful piece of kit!

    If it could be linked someway not only to keywords but numbers of searches together with available domain names too that may be connected to those keywords, that would be even more awesome.

    What else? Well how about an affordable price that makes it within reach of most during these difficult times.

    Regards, BP.

  96. It would be helpful if you could add a column (or columns) that would do simple math on the extracted data.

  97. I would like to be able to export to Word, and to select certain fields without getting the entire table. I can’t wait for you to release this; I am sure it will save me a lot of time.

  98. If it could crawl the same table across multiple pages of a site, extract that data and append it to a file, that would be awesome! I.e., crawl one site and all of its links or pages and grab the info from the one same table.

    When can we try a beta test of this baby?

  99. Hi Jon,

    I think this would be a great tool and very helpful for us to export our bank statements to Excel so that the data may be imported to QuickBooks. Our current method is not so efficient. I also think it would be great if it would work for PDFs as well. I’m sure we would also find many more uses for it. Looking forward to giving it a try.

    Thanks,

    Greg

  100. That’s one of the most useful things of recent years!
    I will wait for it.
    I also hope the extraction tool has a checkbox to extract the links connected to the table as well.

    I think you will sell more if you start with an increasing price.

    Keep up the good work, you are a Winner !

    Thanks again.
    Alessandro

  101. I am pretty sure you can do all that in Excel… with the Paste Special (text) function…

  102. Does it matter if the table in question is created using CSS, or does the program grab what is on the surface?

    Also, I grab company data from a subscription site that doesn’t allow right-clicking in order to copy and paste the formatted tables. Would your program bypass the right-click protection feature on some sites?

    Also, similar to what Stefan mentioned above, I would like to be able to grab multiple pages of table data from a site when tables are spread across more than one page at a time to cut down on copying and pasting.

  103. Great tool, I certainly would have some use for it.

    Sometimes I like to grab articles etc. on web pages, but then you get all the extras that you don’t want. Is it going to be possible to grab other content as well, not just tables?

  104. Jon, looks very useful. I agree with some of the comments asking for a multiple-page system. Thanks, MAlly

  105. I am sorry, but I wouldn’t spend any cash on this. Simply because I already have such a tool and I didn’t have to shell out big bucks to get it (my programmer made it in like 2 hours time). So I also do not see how this is really valuable.

    I thought you were talking about a data mining tool that analyzes and grabs valuable SEO data, so you can SEO your own sites accordingly, and even get links on the exact pages your competitors have their links on. If you give this away for free and sell such an SEO tool as an upsell… you can definitely count me in. Creating an SEO analysis tool like that would cost A LOT of valuable programming hours, and it would definitely be worth some money in that case. But this, IMO, is not.

    I am sorry to say this, because I truly like your other stuff. And hope I didn’t offend you with my comment. But those are just my 2 cents. Hopefully it was useful to you.

  106. Very nice tool but you really got my attention with the bookmark extraction. Yes, PDF support would be very helpful. If it could work against Copyscape-protected sites, then you’d REALLY have something. ;-)

  107. There are so many possibilities with this kind of software. I would like to be able to extract data in the form of reviews from a site like buzzillions.com or amazon.com. I’m not sure how you could make that easy, but maybe have it keyword based and then set how many reviews you would like to extract and from what sites. I’m using your webcomp analyst almost daily, so if you can get this going with enough features to save us even more time, that would be terrific.

  108. Jon;

    I would definitely like to see some scripting or automation. I am always needing to do the same thing over and over.

    Thanks!
    Mark

  109. Jonathan,
    You forgot the most important feature! Search results! Ideally, Web Data Parser should pull the data for all external links from all pages in the search results for a given keyword. It would crawl each URL in the “Top 100” results, for example, on a Google search. Then it would take the external links from all of those pages.

    Let me know as soon as this feature is available. When it is, I’d be willing to pay money for Web Data Parser.

  110. Jonathan,
    This looks like a great tool. One thing I didn’t see was configuration abilities such as setting proxy settings, etc.

  111. Jonathan,

    Great stuff. Simple, focused tools that excel at a single task can be incredibly useful, especially when that task is repetitive. My advice would be to keep this tool simple and unbloated.

    My only specific suggestion would apply when appending data to a file. If I were appending multiple tables into the same file, I could see myself easily losing track of which data was which if there were nothing to uniquely identify each table, so perhaps providing the ability for the user to type in a tag that would appear in a column besides each table would address that problem. For example, I might enter 7% to identify the table I generated for a 7% mortgage, etc.

    Keep up the great work.
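A rough sketch of the tagging idea in the comment above, assuming the appended file is a CSV and the tag goes in the first column. The helper name `append_table` and the sample rows are hypothetical; nothing here reflects how Web Data Parser actually writes its files.

```python
import csv
import io


def append_table(out, rows, tag):
    """Append rows to an open CSV stream, prefixing each row with a user tag."""
    writer = csv.writer(out)
    for row in rows:
        # The tag lands in its own column so tables stay distinguishable.
        writer.writerow([tag, *row])


# Illustrative data only: two small tables appended to the same output.
buffer = io.StringIO()
append_table(buffer, [["30", "665.30"]], "7%")
append_table(buffer, [["30", "733.76"]], "8%")
```

With a real file, `out` would be something like `open(path, "a", newline="")`, so repeated runs keep appending to the same file.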

  112. Hi Jon,

    I have been doing the same thing for years now: Always writing small tools for myself – many of them extracting text for this or that purpose. But usually I was lazy to make one big generic tool.
    String parsing is fun, but can be tricky because it always depends on the context.

    I notice that this tool is made for manual use, there is little automation. Often data is spread over many tables – you will have to click, paste, append over and over.

    How about a simple spidering routine that will read all pages of the table and append automatically. (No, it doesn’t need to follow all the links – just the first link below the table and only if the target URL matches the current URL in some way – like only one parameter differs)

    - Extracting keywords that are in a list you can configure – while recursively spidering the site: Find out if the site somewhere contains certain words, phrases or links…

    - Extract images: ok, there is software for that: I have been using TeleportPro – but how about extracting only images that match a certain pattern?

    - Automatically detect a certain pattern so you won’t have to manually select where the data is: Like detect prices (currency sign is close left or right), stock symbols, link lists, or other tables of data whenever they appear anywhere.

    - detect words in a text and determine word or phrase usage stats: This could be used to find what topic pages/sites are about and if they are human-written or just machine-generated content.

    I hope this helps…
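The "only one parameter differs" rule from the spidering suggestion above could be sketched roughly as follows. This is one possible reading of that rule, assuming the pages are paginated through a query-string parameter; the function name `differs_by_one_param` and the example URLs are hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def differs_by_one_param(current_url, candidate_url):
    """Return True if both URLs share scheme, host and path and
    exactly one query parameter is different, added or removed."""
    cur, cand = urlsplit(current_url), urlsplit(candidate_url)
    if (cur.scheme, cur.netloc, cur.path) != (cand.scheme, cand.netloc, cand.path):
        return False
    cur_q = dict(parse_qsl(cur.query))
    cand_q = dict(parse_qsl(cand.query))
    # Collect every parameter whose value is not identical on both sides.
    changed = [k for k in set(cur_q) | set(cand_q) if cur_q.get(k) != cand_q.get(k)]
    return len(changed) == 1


next_page = differs_by_one_param(
    "http://example.com/list?cat=5&page=1",
    "http://example.com/list?cat=5&page=2",
)
```

A spider built on this check would follow a link below the table only when the check passes, which keeps it from wandering off to unrelated pages.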

  113. Hi Jonathan,

    Useful tool!

    I would use it to capture data from Google’s External keyword tool.

    Google lets you save the keywords, but not the rest of the data, like
    search volume, CPC, trends, etc.

    When running multiple PPC campaigns having this info handy and printed would be a big help, rather than having to run the keywords each time.

    Based on this use, you could sell a lot of these!

    Good luck, look forward to getting this!

    Howard “OutSourcerer” Tiano

  114. It is a shame we did not get to see the Google Keywords Tool in the demo. Grabbing the data and knowing the settings for the page at the same time would be great – like whether it was phrase or site, whether synonyms was switched on or other pages on the site, the sort order, the root phrase being analysed, and whether the table was for the Core Related words or the Additional Words plus of course which currency.
    That said, as it stands it looks pretty useful. I can see myself using it – but not every day.

  115. I could use a feature that would:
    a) follow all links on all pages, i.e. if you had results that showed 20 pages of prices, or 50 pages of states, it would be nice if the program automatically followed all the links and appended the data as collected
    b) add a column to the collected data citing the source URL, i.e. these first 25 rows came from URL #1, rows 26-50 came from URL #2
    c) upon rechecking a previously collected page, show only NEW data, i.e. if you are monitoring an eBay page, you only want to see data that was not previously collected
    d) skip columns during data collection, i.e. out of 5 columns, I only need columns 1 and 5 – not the middle columns
    There is more, but for starters…
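Points (b), (c) and (d) in the list above are simple row transformations once a table has been extracted. Here is a rough sketch, assuming each table is a list of rows and each row a list of strings; all three helper names and the sample data are hypothetical.

```python
def select_columns(rows, keep):
    """(d) Keep only the columns at the given zero-based positions."""
    return [[row[i] for i in keep] for row in rows]


def add_source(rows, url):
    """(b) Append the source URL as an extra column on every row."""
    return [[*row, url] for row in rows]


def only_new(rows, seen):
    """(c) Return rows not collected before, remembering them in `seen`."""
    fresh = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            fresh.append(row)
    return fresh


seen = set()
page = [["widget", "x", "y", "z", "9.99"], ["gadget", "x", "y", "z", "4.50"]]
first = only_new(add_source(select_columns(page, [0, 4]), "http://example.com/p1"), seen)
again = only_new(add_source(select_columns(page, [0, 4]), "http://example.com/p1"), seen)
```

On the second pass `again` comes back empty because every row is already in `seen`. Persisting `seen` between runs (for example in a small file) would be needed for real monitoring.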

  116. This tool is great. The ability to automate its use, for example to visit a website once a day and extract and append data to a file, would make it much more useful.

  117. I have used several data extraction programs.

    How can I get a copy to demo?
    Thanks :)

  118. Hit Submit too quickly…

    Jonathan, I like the tools you use, a lot….have bought many and still use many including your linking programs…but this one might not bring as much value as the others….

  119. I will admit that I do not try to copy tables often enough, but you can absolutely copy table information into Excel and, with just a few clicks, remove all spaces and commas.

    All you do is copy the table, head over to your Excel spreadsheet and, instead of clicking “Paste”, choose “Paste Special” and choose Text…

    You’re done.

    You can then use the Replace feature to remove any data you do not want, including spaces. Simply open the “Replace” function, put your cursor in the first text box, hit the space bar once and leave the bottom text box blank… click Replace and you’re done. You just told Excel to replace any single spaces with nothing.

  120. hi Jonathan,
    Creative stuff as per your usual. I don’t see a use for me at this stage though and can’t think of what else to suggest you add.
    All the best with the development.

  121. Whatever you do, please make this free or super duper cheap. I seem to have an addiction to Jonathan Leger products and I’m tired of spending money! :) luv your stuff, thanks!

  122. I like the idea of this tool, but I think that having a macro scripting function would be handy. That way I could ‘teach’ the script to go to a page, hit the table, extract the data and save it.

    If this could then be run on a timed basis then I would be a very happy man. ;)
    This would mean I could build time series data – stuff updating over time without me needing to go get the info every day, hour, whatever.

    Andrew.

  123. It would be good if it could extract all the pages inside a password protected forum.

  124. Cool Jon!

    Can you make it extract data from an RSS feed, like the title and descriptions etc.? That can be useful to write some more articles…

  125. Great tool even as is!

    Some suggestions:

    1. Allow it to remove selected columns.

    2. Not sure if it only parses HTML tables – if so, parsing CSS-defined tables as well would certainly be an awesome feature, too.

    3. Parse SERPs: in their entirety, links + descriptions only, etc.

  126. You have got to be kidding me! You have done it again. I had always wanted to get data right off a page that quickly (talk about saving time) but could not.

    How about gathering the headline text (you know, the ones you can’t copy and paste) of every offer or web page that you get in your email? Thus creating a file of headlines and/or benefits to use in your own marketing.

    got to hand it to you : matthew w faulkner

  127. Hi Jonathan,

    Great tool…

    Pulling datafeeds off of a CJ download would be fantastic!

    All the best,

    Steve

  128. Great tool, Jonathan. I think that you have covered most of the important bases, and it should be useful to most.
    Regards and best of luck.

  129. Wish we could have a free try-out.

  130. Extracting from pdf files would be great.

    Also, selecting specific data cells as opposed to the entire table.

    The link extraction tool would be great if, in addition, it would create the text embedded with the link address, as shown in the original file. This would save additional time.

    Hope to see it on the market soon.

    Edison

  131. Awesome job, Jonathan. I’d like to see the tool be able to extract entire chunks of data. For instance, if I’m looking for a list of people in a certain group or profession, and I find all their data grouped by individual (e.g. name, company, address, phone, email address, etc.), there’s no easy way to extract that data to a usable format. I’d love to see if you can do it.

  132. Hi John,

    Nice tool. I do a lot of this by hand, so I think if the tool is priced right I would consider it.

    As someone else said, it would be nice to drop the info directly into Excel.

    Also, if the tool could be used as a capture – such as any table, block of data or the whole web page (like Snagit does) – and save it as an image, that would definitely be a useful tool.

    My “2”,

    Chuck

  133. Looks great.

    How difficult would it be to make the data draggable to a CSV file rather than having to forever locate the saved files you want to append?

  134. Also, just read a comment above that reminded me – can it be saved in any format other than CSV, and exported to Excel or MS Word?

  135. I really love your article wizard – used it today in fact, and it really makes life easy. What I would love is if we could choose data fields to copy. For example, there are some tables that I have that have 100s of fields, but I only want to copy a small cross-section. Because of the volume of the file, if all the data is exported, it becomes too cumbersome to manage and edit. Any way to pre-define copy-able fields?
    Also, does it only work on tables? Can it extract text/content from selected pdf pages? I would LOVE that!

  136. Thanks for the great ideas guys. Keep ‘em coming. I’m reading all of the comments and getting some good ideas.

  137. I always love the tools you design, Jonathan! They are simple to use and powerful. One of my tasks is to identify specific email addresses of key personnel in, for example, pharma companies. Is there a way to add this capability?
    Best,
    Patrick McCall

  138. Of course extracting from PDFs seems beyond the scope of this
    tool… but it is nonetheless a very powerful function.

    The ability to extract text and other data in the form of “questions” that are asked on sites such as Yahoo Answers is incredibly
    useful for niche research and other applications also.

    Regards,

    Hanif

  139. Here is what I need and would DIE for: I use Popshops.com for my “store” on my site. It uses PHP for this to populate more than just one page. I want to add links and products on my article pages but I want the person to be able to click the link of the product and go back to a page on my site with more of the same product or the page that the product came from. So, I would like to get data from a page like this: http://myinteriordecorator.com/store/Dressers-and-Changing-Tables.php

    And then select one or two of the products to put on a page about decorating a nursery using changing tables and dressers. When the link, photo, etc. is clicked, it would take the visitor BACK to the page above that has ALL the changing tables and dressers for nurseries listed.

    Does that make sense?

    The other way I might use it is taking the data from each of these main pages in my “store” and creating a database on the back end to use with some custom scripts I have had created for me.

    Let me know if it will work for this. It may not actually. :(

    Cool product.

    Rhonda

  140. Could it extract data from graphs? Like economic graphs or graphs of new properties sold, etc.

  141. Great tool, Jonathan.
    What I would like to be able to do is extract data from Commission Junction so I can get affiliate links and product links faster and more easily than is currently possible.

  142. This is a great tool. If it could extract from the entire page, not just tables, that would be great too.

    Also, it would be useful if it could add extracted data and links to a feed that can be added to a web page.

    If it could add the data and links to a MySQL database, that would be great too.

    I’ve had to mod the heck out of one script just to make two databases talk to each other. If you could make it do that as well, this software would be fabulous.

  143. My biggest problem is extracting data from PDF files. Is it possible for your program to read those pages?

  144. When exporting, make sure it allows me to open with Excel so I don’t have to save it to a file, open Excel, find the file, etc.

    Also, on the links option, include the ability to output the link and anchor text together, with the HTML href coding already in place, ready to paste into HTML code on a page.

  145. Really looks like a useful tool.

    It would be fantastic if it also worked on PDFs.

  146. GREAT! You have solved my pain.

  147. Great tool. I can see it also being useful for extracting image links and then downloading the images as an option.

    For example, I would go to images.google.com. I search for images of old time radio. I use the tool to select all the image links. I then push a button to download all the images.

  148. Jonathan, maybe you could add a feature where the data will actually be fed into an Access Database? I think it would be easier to manipulate the data and draw conclusions that way.

    Overall, it looks like a good tool. But, I think I would need to understand how this tool will help me before I would want to buy it.

    Perhaps you could show examples of how collecting this data has helped you – examples of exactly what you used this data for. That would be proof of the actual value of this tool.

  149. Jon,

    Simple but VERY effective piece of software! I can see this saving me a fair bit of time on keyword analysis.

    Keep em coming,

    Darren

  150. Hi Jon,

    This looks like a useful tool. I’m also finding that extracting data from tables is sometimes difficult. I’ve found several quick workarounds that mostly do the trick, but there are some tables that nothing short of tedious, mind-numbing data manipulation will help. The most immediate example is the affiliate link table in AdSpy Pro. I can look at it on screen, but everything comes out in the wrong column from the headers with whole rows between data that belong together. Frustrating. If this tool can handle that table, it would be worth looking at.

