This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: html parser location:


> According to Tamirat Tesfaye on 8/28/2008 11:49 PM:
>> Dear Eric, I am working on an application that parses remote HTML
>> content and populates a database with it. I have used Perl for this.
>> However, Cygwin said that it was unable to locate the HTML
>> Content Parser files. I have tried reinstalling Cygwin from several
>> mirrors to check whether that was the problem, but in vain. Would you
>> please lend a hand on this matter?
>
> I don't know why you picked me to send your private mail to, but it was
> the wrong choice.  Redirecting to the list.  And consider following the
> directions at http://cygwin.com/problems.html next time.

The cygwin list cannot help either.
You need a perl module, and perl help from the places recommended in
the perl documentation.

$ perldoc perlfaq
is a good start.

cygwin perl comes only with the core modules plus a few additional modules
so that CPAN works out of the box.
If it does not come with the "HTML Content Parser files", you have to install
them yourself via cpan.
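
A minimal sketch of how to check whether perl can load a given module
before reaching for cpan. The helper name have_module is made up for this
example, and HTML::Parser is only a stand-in for whichever module your
script actually needs.

```shell
# Return success if perl can load the named module, failure otherwise.
# have_module is a hypothetical helper name, not part of perl or cygwin.
have_module() {
    perl -M"$1" -e 1 2>/dev/null
}

# HTML::Parser is only an example; substitute the module your script uses.
if have_module HTML::Parser; then
    echo "HTML::Parser is available"
else
    echo "HTML::Parser is missing; try: cpan HTML::Parser"
fi
```

The sketch only reports what it finds; it does not run cpan itself, since
an install needs network access and possibly a configured CPAN client.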

There are several modules for parsing HTML.
Search for them at http://search.cpan.org/
http://search.cpan.org/search?query=HTML+Content+Parser&mode=all
HTML::Parser, for example, is often used.

And as it turns out, HTML-Parser-3.56, XML-Parser-2.36, XML-SAX-0.16
and XML-LibXML-1.66 are already included with cygwin perl.
Other modules should be installed via cpan.
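
Which versions your own installation ships may differ from the ones
listed above, so it can be worth asking perl directly. A rough sketch,
assuming the modules report a version through the usual VERSION method;
module_version is a made-up helper name:

```shell
# Print the version of an installed perl module; print nothing and
# return failure if the module cannot be loaded.
# module_version is a hypothetical helper name for this example.
module_version() {
    perl -M"$1" -e 'my $v = $ARGV[0]->VERSION; print "$v\n"' "$1" 2>/dev/null
}

# The module names are taken from the list above.
for mod in HTML::Parser XML::Parser XML::SAX XML::LibXML; do
    ver=$(module_version "$mod") || ver="not installed"
    echo "$mod: $ver"
done
```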
-- 
Reini Urban
http://phpwiki.org/ http://murbreak.at/
