<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p><font size="-1">A practical way to proceed may be to have LLVM
    provide an html file list from their server by going to the
    top-level directory that serves <a class="moz-txt-link-freetext" href="https://llvm.org">https://llvm.org</a> and running the following
    command:</font></p>
<p><font size="-1">find . \( -name '*.htm' -o -name '*.html' \) &gt; llvm.org_html_file_list<br>
</font></p>
<p><font size="-1">This lists every file with an htm or html
    extension, together with its parent directories. There may be
    several top-level directories of interest, such as one for
    <a class="moz-txt-link-freetext" href="https://clang.llvm.org/">https://clang.llvm.org/</a>, and each could be put into its own
    file list, though this is secondary at the moment. Recording the
    name of the top-level directory in each case may help, or the
    top-level web-page name could work. We just need to be sure the
    changes get back to the proper directory. Tar or zip the list(s)
    for easy download.<br>
</font></p>
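<p><font size="-1">Where find is not available, the same kind of
    listing could be produced with a short script. The following is
    only a sketch: the function name list_html_files is hypothetical,
    and matching the extensions case-insensitively is an assumption
    about what the list should contain.</font></p>

```python
import os


def list_html_files(top):
    """Return sorted paths, relative to top, of files ending in .htm or .html."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            # Assumption: extensions are matched case-insensitively,
            # unlike find's -name test, which is case-sensitive.
            if name.lower().endswith((".htm", ".html")):
                rel = os.path.relpath(os.path.join(dirpath, name), top)
                # Use forward slashes so the list reads like URL paths.
                found.append(rel.replace(os.sep, "/"))
    return sorted(found)
```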
<p><font size="-1">The LLVM html files could then be downloaded from
    the list to a user's local computer using wget, the analysis
    done and the changes made. The changes could then be uploaded to
    <a class="moz-txt-link-freetext" href="https://bugs.llvm.org">https://bugs.llvm.org</a> as diff files to be applied as patches, or as LLVM
    directs.<br>
</font></p>
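<p><font size="-1">As an illustration of the diff step, a patch for
    one corrected page could be produced with Python's difflib. This
    is a sketch only: the function name make_patch and the a/ and b/
    path prefixes are assumptions, not anything LLVM has asked
    for.</font></p>

```python
import difflib


def make_patch(original_html, corrected_html, path):
    """Return a unified diff, as one string, between two versions of a page."""
    diff = difflib.unified_diff(
        original_html.splitlines(keepends=True),
        corrected_html.splitlines(keepends=True),
        fromfile="a/" + path,  # hypothetical prefix for the original file
        tofile="b/" + path,    # hypothetical prefix for the corrected file
    )
    # An empty string means the two versions are identical.
    return "".join(diff)
```

<p><font size="-1">Calling make_patch with the downloaded page text,
    the corrected text and the page's path in the file list would
    give the text of one diff file.</font></p>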
<p><font size="-1">Without the file lists from LLVM for this local
    procedure, the only option would be to remove the html link tags
    for the dead links. That would be done by downloading the LLVM
    site's html pages by following page links with wget. Since
    removing the tags also removes the easy means of correcting
    those links later, where correction is possible, and so loses
    possibly useful information, this is not likely to be
    a preferred option.<br>
</font></p>
<p><font size="-1">Because the dead-link list below does not give the
    parent pages that contain each link, the first option would
    likely require downloading most or all of the html files in the
    list in order to find the few of concern. Whether there are
    copyright or other issues with downloading large portions of the
    LLVM site should also be considered.</font></p>
<p><font size="-1">wget has an option (--convert-links) that, when
    downloading a site, rewrites the links in the downloaded pages
    so that they work for local viewing, which may achieve what
    Patrick suggests. Given the scale of that change, it would best
    be done on the LLVM server: make a converted copy with wget,
    then point a browser at the copy to check the result before
    going live. wget alone may not be sufficient, and further link
    changes done with a program may be required. It would be easy to
    redirect back to the prior LLVM site if critical problems were
    found later. Still, a change of this scale calls for more
    detailed consideration at LLVM than the relatively few dead-link
    changes identified so far, which could be addressed with diff
    uploads.</font></p>
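<p><font size="-1">If further link changes by program turn out to be
    needed, the rewrite Patrick describes below (linking to
    /docs/foo/bar rather than to the full llvm.org address) could be
    sketched as follows. The function name and the regular
    expression are assumptions, and it only handles double-quoted
    href attributes.</font></p>

```python
import re

# Matches href="http(s)://(www.)llvm.org/docs/..." and captures the /docs/... part.
# Assumption: hrefs are double-quoted and written in lower case.
_ABSOLUTE_DOCS_HREF = re.compile(
    r'href="https?://(?:www\.)?llvm\.org(/docs/[^"]*)"'
)


def make_docs_links_relative(html):
    """Rewrite absolute llvm.org/docs hrefs as site-relative /docs/... hrefs."""
    return _ABSOLUTE_DOCS_HREF.sub(r'href="\1"', html)
```

<p><font size="-1">Links to other hosts, such as bugs.llvm.org, are
    left untouched by this pattern.</font></p>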
<p><font size="-1">Writing a program to do the dead-link analysis
    and changes seems the less likely option: the programmer would
    have to write for an environment not immediately available to
    them, and a program would not offer the incremental, clearly
    visible changes that diff uploads do.<br>
</font></p>
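<p><font size="-1">For reference, the file-name matching procedure
    quoted below could look roughly like this if it were written as
    a program. It is a sketch under assumptions: the function name
    classify_dead_links is hypothetical, the dead links are full
    URLs, and the existing files are given as paths with forward
    slashes, as in the file list.</font></p>

```python
import posixpath
from collections import defaultdict
from urllib.parse import urlsplit


def classify_dead_links(dead_links, existing_files):
    """Match each dead link to existing files that share its file name.

    Returns (fixable, missing, ambiguous):
      fixable   maps a link to the single existing path with that file name
      missing   lists links whose file name matches no existing file
      ambiguous maps a link to all existing paths with that file name
    """
    # Index the existing files by file name (less path).
    by_name = defaultdict(list)
    for path in existing_files:
        by_name[posixpath.basename(path)].append(path)

    fixable, missing, ambiguous = {}, [], {}
    for link in dead_links:
        # urlsplit drops any #fragment, leaving only the path part.
        name = posixpath.basename(urlsplit(link).path)
        candidates = by_name.get(name, [])
        if len(candidates) == 1:
            fixable[link] = candidates[0]
        elif not candidates:
            missing.append(link)
        else:
            ambiguous[link] = sorted(candidates)
    return fixable, missing, ambiguous
```

<p><font size="-1">Only the links in the first group could be
    updated automatically. The other two lists would still need a
    person to look at them.</font></p>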
<p><font size="-1">Regards, Neil Nelson<br>
</font></p>
<div class="moz-cite-prefix"><font size="-1">On 9/1/19 4:33 AM,
Patrick Nappa wrote:</font><br>
</div>
<blockquote type="cite"
cite="mid:CAAMHP5P=D521bXa+8u5ZO9yzNfEF+dRos3ygEK=6CY-e=2wy0g@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p><font size="-1">It would be a fairly direct procedure to
associate links by their file name (less path) with file
locations. The process would then update the links for
the correct paths, list links without an existing file,
and list dead links having more than one existing file
with the same name.</font></p>
</div>
</blockquote>
<div>and<br>
</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Patrick,
how long does the crawl take? I suspect if we fixed internal
documentation links so that they point to local copies of
documentation when building locally it would be quite quick
(no actual idea though).</blockquote>
<div> </div>
<div>That crawl was actually done on the live site, using the
linkchecker tool. <br>
</div>
<div><br>
</div>
<div>Doing it locally would indeed be much better, and it turns
        out Sphinx has a built-in tool for doing such a check (`cd
        llvm/docs &amp;&amp; make -f Makefile.sphinx linkcheck`), but
        it also checks that external hyperlinks are reachable. Now, the
runtime for this can be seriously reduced if we change all
internal document links to actually point to internal document
links (i.e. link to /docs/foo/bar, rather than <a
href="https://llvm.org/docs/foo/bar" moz-do-not-send="true">https://llvm.org/docs/foo/bar</a>,
or <a href="http://llvm.org/docs/foo/bar"
moz-do-not-send="true">llvm.org/docs/foo/bar</a> - easily
fixable), so as to avoid an internet check. I do believe we
should check external links still, as having documentation
link to nowhere can be jarring, however I don't think such
crawls need to be as frequent.<br>
</div>
<div><br>
</div>
<div>Cheers,</div>
<div>Patrick<br>
</div>
</div>
<br>
<div class="gmail_quote">
      <div dir="ltr" class="gmail_attr">On Thu, Aug 29, 2019 at 7:20
        PM James Henderson via llvm-dev &lt;<a
          href="mailto:llvm-dev@lists.llvm.org" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>&gt;
        wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div>Patrick, how long does the crawl take? I suspect if we
fixed internal documentation links so that they point to
local copies of documentation when building locally it
would be quite quick (no actual idea though). That in turn
would probably make it feasible to add to the existing
documentation build bots, I think.</div>
<div><br>
</div>
<div>James<br>
</div>
</div>
<br>
<div class="gmail_quote">
          <div dir="ltr" class="gmail_attr">On Thu, 29 Aug 2019 at
            03:47, Neil Nelson via llvm-dev &lt;<a
              href="mailto:llvm-dev@lists.llvm.org" target="_blank"
              moz-do-not-send="true">llvm-dev@lists.llvm.org</a>&gt;
            wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
              <p><font size="-1">Patrick, you have identified a good
                  way to do this.</font></p>
              <p><font size="-1">It is likely that the links
                  are to files in a directory structure on a single
                  server, with that file structure/path given by the
                  link text, as we see in your dead link list, and
                  that in a good number, perhaps a large majority,
                  of the cases the file names (less the directory
                  path) are unique. Given that, it would be a fairly
                  direct procedure to associate links by their file
                  name (less path) with file locations. The process
                  would then update the links for the correct paths,
                  list links without an existing file, and list dead
                  links having more than one existing file with the
                  same name.</font></p>
<p><font size="-1">The frequency of that run would
depend on the frequency of dead-link discovery that
the run could provide.<br>
</font></p>
<p><font size="-1">Regards, Neil Nelson</font><br>
</p>
<div
class="gmail-m_2125254870880644934gmail-m_7144157776072443842moz-cite-prefix">On
8/28/19 7:52 PM, Patrick Nappa via llvm-dev wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Hi all,</div>
<div><br>
</div>
<div>I'm currently in the process of updating the
Kaleidoscope tutorials (first and foremost, the
ORC/BuildingAJIT ones), and I've noticed a fair
few 404s which are lingering within the current
visible documentation. Some of these don't seem to
have linked to existing pages for a while.</div>
<div><br>
</div>
<div>I was wondering if there was a way to set up a
check in the buildbot to ensure that documentation
doesn't break between builds? I'm happy to fix the
current dead links I've found (see below) but
thought it might be wise to set up a more
automated approach in the future. Does anyone have
any tips on how I'd go about doing this/if this
should be set up at all?<br>
</div>
<div><br>
</div>
<div>I ran a web crawler to find each of the dead
links (this may not be exhaustive), and they are
as follows: <br>
</div>
<div><span style="font-family:monospace"><a
href="https://llvm.org/docs/TestSuiteMakefileGuide"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/TestSuiteMakefileGuide</a></span><br>
<span style="font-family:monospace"><a
href="https://llvm.org/docs/doxygen/structLICM.html"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/doxygen/structLICM.html</a><br>
<a
href="https://llvm.org/docs/tutorial/LangImpl5.html#for-loop-expression"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/LangImpl5.html#for-loop-expression</a><br>
<a
href="https://llvm.org/docs/tutorial/LangImpl7.html#user-defined-local-variables"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/LangImpl7.html#user-defined-local-variables</a><br>
<a href="http://llvm.org/docs/lnt/modindex.html"
target="_blank" moz-do-not-send="true">http://llvm.org/docs/lnt/modindex.html</a><br>
<a
href="https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl6.html#user-defined-unary-operators"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl6.html#user-defined-unary-operators</a><br>
<a
href="https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl5.html#for-loop-expression"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl5.html#for-loop-expression</a><br>
<a
href="https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl7.html#user-defined-local-variables"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl7.html#user-defined-local-variables</a><br>
<a
href="https://llvm.org/docs/tutorial/LangRef.html#instruction-reference"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/LangRef.html#instruction-reference</a><br>
<a
href="https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl4.html#adding-a-jit-compiler"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl4.html#adding-a-jit-compiler</a><br>
<a
href="https://llvm.org/docs/tutorial/WritingAnLLVMPass.html"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/WritingAnLLVMPass.html</a><br>
<a
href="https://llvm.org/docs/tutorial/Passes.html"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/Passes.html</a><br>
<a
href="https://llvm.org/docs/tutorial/ProgrammersManual.html#viewing-graphs-while-debugging-code"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/ProgrammersManual.html#viewing-graphs-while-debugging-code</a><br>
<a
href="https://llvm.org/docs/tutorial/SourceLevelDebugging.html"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/SourceLevelDebugging.html</a><br>
<a
href="https://llvm.org/docs/tutorial/Frontend/PerformanceTips.html"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/Frontend/PerformanceTips.html</a><br>
<a
href="https://llvm.org/docs/tutorial/GetElementPtr.html"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/GetElementPtr.html</a><br>
<a
href="https://llvm.org/docs/tutorial/GarbageCollection.html"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/GarbageCollection.html</a><br>
<a
href="https://llvm.org/docs/tutorial/ExceptionHandling.html"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/ExceptionHandling.html</a><br>
<a
href="https://www.llvm.org/docs/doxygen/structLICM.html"
target="_blank" moz-do-not-send="true">https://www.llvm.org/docs/doxygen/structLICM.html</a><br>
<a
href="http://llvm.org/docs/TestSuiteMakefileGuide"
target="_blank" moz-do-not-send="true">http://llvm.org/docs/TestSuiteMakefileGuide</a><br>
<a
href="http://llvm.org/docs/doxygen/structLICM.html"
target="_blank" moz-do-not-send="true">http://llvm.org/docs/doxygen/structLICM.html</a><br>
<a
href="https://www.llvm.org/docs/TestSuiteMakefileGuide"
target="_blank" moz-do-not-send="true">https://www.llvm.org/docs/TestSuiteMakefileGuide</a><br>
<a
href="http://llvm.org/docs/tutorial/LangImpl5.html#for-loop-expression"
target="_blank" moz-do-not-send="true">http://llvm.org/docs/tutorial/LangImpl5.html#for-loop-expression</a><br>
<a
href="http://llvm.org/docs/tutorial/LangImpl7.html#user-defined-local-variables"
target="_blank" moz-do-not-send="true">http://llvm.org/docs/tutorial/LangImpl7.html#user-defined-local-variables</a></span>
</div>
<div><br>
</div>
<div>Some of these are trivial mistakes (i.e. <a
href="https://llvm.org/docs/tutorial/LangRef.html#instruction-reference"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/tutorial/LangRef.html#instruction-reference</a>
-> <a
href="https://llvm.org/docs/LangRef.html#instruction-reference"
target="_blank" moz-do-not-send="true">https://llvm.org/docs/LangRef.html#instruction-reference</a>),
and some require a bit more inspection.<br>
</div>
<div><br>
</div>
<div>Regards,</div>
<div>Patrick<br>
</div>
</div>
<br>
<fieldset
class="gmail-m_2125254870880644934gmail-m_7144157776072443842mimeAttachmentHeader"></fieldset>
<pre class="gmail-m_2125254870880644934gmail-m_7144157776072443842moz-quote-pre">_______________________________________________
LLVM Developers mailing list
<a class="gmail-m_2125254870880644934gmail-m_7144157776072443842moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org" target="_blank" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>
<a class="gmail-m_2125254870880644934gmail-m_7144157776072443842moz-txt-link-freetext" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</body>
</html>