.\"
.\" webnew.man,v 1.5 2001/05/27 22:08:01 kim Exp
.\"
.TH WEBNEW 1 "@WEBNEWDATE@" "TAC @WEBNEWVER@"
.SH NAME
webnew \- retrieve modification times of HTTP documents
.SH SYNOPSIS
.B webnew
.RB [ \-PRVadinrvx ]
.RB [ \-A
.IR username:password ]
.RB [ \-c
.IR type ]
.RB [ \-e
.IR email ]
.RB [ \-t
.IR title ]
.I URL
.br
.SH DESCRIPTION
.B webnew
produces a listing of URLs (web documents) sorted by the last
modification time as reported by the HTTP server.  By default it
writes an HTML 2.0 document to standard output.  The URL on
the command line is used as the starting point.
.PP
By default, the URLs to include in the listing are extracted from
the document specified by the URL.  For a recursive search of URLs
to include, see the \fB\-R\fR and \fB\-r\fR options.
.SH OPTIONS
.TP
.B \-A
Authenticate with the given \fIusername\fR and \fIpassword\fR using
HTTP basic authentication.  This is only needed for password-protected
documents.
.TP
.B \-P
Do not use proxies to access the documents.  By default, proxy
definitions are taken from the standard environment variables.
.TP
.B \-R
Become a "robot" and turn on \fB\-r\fR.  To restrict the retrieval
of documents, you can use a "/robots.txt" file on your server (the
user agent name for \fBwebnew\fR is "webnew").
.TP
.B \-V
Print the version of \fBwebnew\fR and exit.
.TP
.B \-a
Use the text of the first anchor found pointing to each URL as
the anchor text in the produced listing.  The default is to
prefer the title specified in the document.  This option
considerably speeds up non-recursive listings, because the
individual documents are not retrieved at all.
.TP
.B \-c
Specify a regular expression that the content type of a document
must match for the document to be included in the listing.  The
default is "text".
.TP
.B \-d
Output a trace of the stack of URLs to retrieve.  Automatically
turns on \fB\-v\fR.
.TP
.B \-e
Use the given email address in the HTTP requests.  Also causes
a <LINK REV=MADE> tag to be included in the HTML output.
.TP
.B \-i
Only output the unordered URL items.  This produces HTML that
should not be served as a standalone document.  It is intended
for including the output inside another HTML file.
.TP
.B \-n
Report URLs for which no modification date was retrieved.
.TP
.B \-r
Use the specified URL as the initial URL to include in the
listing.  Then retrieve that document and extract URLs from it
to be further included and retrieved.  Only URLs beginning with
the initial URL will be retrieved (to avoid infinite listings).
This is very useful for completely automatic "what's new"
listings.
.TP
.B \-t
Set the title and top level heading to the given text.  The
default title is "What's new".
.TP
.B \-v
Show the URL of each retrieved document and its modification time
(if one was reported by the server).  If the URL was not searched for
more links, the reason is reported in parentheses.
.TP
.B \-x
Exclude pointers to the home page of \fBwebnew\fR from the
output.  If you use this option, please make sure you provide
a pointer to the home page in some other fashion.  The URL for
\fBwebnew\fR is @WEBNEWURL@ and it
will always contain a pointer to the most recent version of
the software as well as installation and use instructions.
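.PP
As an illustration of the restriction mentioned under \fB\-R\fR, a
"/robots.txt" file along the following lines should keep \fBwebnew\fR
out of one part of a server while leaving other robots unaffected.
The directory name is hypothetical; the record syntax is that of the
robot exclusion standard.
.PP
.na
.nf
# Hypothetical example: keep webnew out of /private/
User-agent: webnew
Disallow: /private/
.fi
.ad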
.SH EXAMPLES
.na
.nf
mv new.html old.html
webnew \-a http://www.tac.nyc.ny.us/kim/old.html > new.html
.PP
webnew \-r http://www.tac.nyc.ny.us/kim/ > new.html
.fi
.ad
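.PP
The options can be combined.  The following sketches assume a
hypothetical server www.example.org and hypothetical credentials
and title; only options documented in this manual are used.
The first writes a bare list of items, found recursively, for
inclusion in another HTML file; the second lists a password-protected
area under a custom title.
.PP
.na
.nf
webnew \-r \-i http://www.example.org/docs/ > items.html
.PP
webnew \-A user:secret \-t "Recent changes" http://www.example.org/members/ > new.html
.fi
.ad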
.SH BUGS
No known bugs.
.SH AUTHOR
Kimmo Suominen <kim@tac.nyc.ny.us>
.SH SEE ALSO
.IR urlget (1)
.PP
Please read the document "A Standard for Robot Exclusion"
for more information on restricting robots.
.na
.nf
http://www.robotstxt.org/wc/norobots.html
.fi
.ad
