<div dir="ltr"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p><font size="-1">It would be a fairly direct procedure to
associate links by their file name (less path) with file
locations. The process would then update the links for the
correct paths, list links without an existing file, and list
dead links having more than one existing file with the same
name.</font></p></div></blockquote><div>and<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Patrick, how long does the crawl take? I suspect if we fixed internal
documentation links so that they point to local copies of documentation
when building locally it would be quite quick (no actual idea though).</blockquote><div> </div><div>That crawl was actually done on the live site, using the linkchecker tool. <br></div><div><br></div><div>Doing it locally would indeed be much better, and it turns out Sphinx has a built-in tool for such a check (`cd llvm/docs &amp;&amp; make -f Makefile.sphinx linkcheck`), though it also checks that external hyperlinks are reachable. The runtime can be seriously reduced if we change all internal documentation links to point to site-internal paths (i.e. link to /docs/foo/bar, rather than <a href="https://llvm.org/docs/foo/bar">https://llvm.org/docs/foo/bar</a> or <a href="http://llvm.org/docs/foo/bar">llvm.org/docs/foo/bar</a>, which is easily fixable), so that checking them does not require network access. I do believe we should still check external links, as having documentation link to nowhere can be jarring; however, I don't think such crawls need to be as frequent.<br></div><div><br></div><div>Cheers,</div><div>Patrick<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Aug 29, 2019 at 7:20 PM James Henderson via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Patrick, how long does the crawl take? I suspect if we fixed internal documentation links so that they point to local copies of documentation when building locally it would be quite quick (no actual idea though). 
That in turn would probably make it feasible to add the check to the existing documentation build bots, I think.</div><div><br></div><div>James<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 29 Aug 2019 at 03:47, Neil Nelson via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p><font size="-1">Patrick, you have identified a good way to do
this. The links most likely point to files in a directory
structure on a single server, with the file path given by the
link text, as we see in your dead link list. In a good number
of cases, perhaps a large majority, the file names (less the
directory path) are also likely to be unique.</font></p>
<p><font size="-1">It would be a fairly direct procedure to
associate links by their file name (less path) with file
locations. The process would then update the links for the
correct paths, list links without an existing file, and list
dead links having more than one existing file with the same
name.</font></p>
<p><font size="-1">The frequency of that run would depend on the
frequency of dead-link discovery that the run could provide.<br>
</font></p>
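<p><font size="-1">The file-name matching procedure described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names (<code>index_files_by_name</code>, <code>classify_links</code>) are hypothetical, links and files are given as plain path strings that have already been collected, and anchors after "#" are carried over unchanged without checking that they exist in the target page.</font></p>
<pre>
```python
import posixpath
from collections import defaultdict


def index_files_by_name(file_paths):
    """Map each base file name (less path) to every file path with that name."""
    index = defaultdict(list)
    for path in file_paths:
        index[posixpath.basename(path)].append(path)
    return index


def classify_links(link_paths, file_paths):
    """Split dead links into updatable, missing, and ambiguous groups.

    Hypothetical helper: returns (updates, missing, ambiguous), where
    updates maps a dead link to its corrected path, missing lists links
    with no file of that name, and ambiguous maps a dead link to all
    files sharing its name.
    """
    existing = set(file_paths)
    index = index_files_by_name(file_paths)
    updates, missing, ambiguous = {}, [], {}
    for link in link_paths:
        # Separate the file path from an optional "#anchor" suffix.
        path, sep, fragment = link.partition("#")
        if path in existing:
            continue  # the link already points at an existing file
        candidates = index.get(posixpath.basename(path), [])
        if len(candidates) == 1:
            updates[link] = candidates[0] + sep + fragment
        elif not candidates:
            missing.append(link)
        else:
            ambiguous[link] = sorted(candidates)
    return updates, missing, ambiguous
```
</pre>
<p><font size="-1">Only the first group can be rewritten automatically; the other two lists would still need a person to look at them.</font></p>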
<p><font size="-1">Regards, Neil Nelson</font><br>
</p>
<div class="gmail-m_2125254870880644934gmail-m_7144157776072443842moz-cite-prefix">On 8/28/19 7:52 PM, Patrick Nappa via
llvm-dev wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Hi all,</div>
<div><br>
</div>
<div>I'm currently in the process of updating the Kaleidoscope
tutorials (first and foremost, the ORC/BuildingAJIT ones), and
I've noticed a fair few 404s which are lingering within the
current visible documentation. Some of these don't seem to
have linked to existing pages for a while.</div>
<div><br>
</div>
<div>I was wondering if there was a way to set up a check in the
buildbot to ensure that documentation doesn't break between
builds? I'm happy to fix the current dead links I've found
(see below) but thought it might be wise to set up a more
automated approach in the future. Does anyone have any tips on
how I'd go about doing this/if this should be set up at all?<br>
</div>
<div><br>
</div>
<div>I ran a web crawler to find each of the dead links (this
may not be exhaustive), and they are as follows: <br>
</div>
<div><span style="font-family:monospace"><a href="https://llvm.org/docs/TestSuiteMakefileGuide" target="_blank">https://llvm.org/docs/TestSuiteMakefileGuide</a></span><br>
<span style="font-family:monospace"><a href="https://llvm.org/docs/doxygen/structLICM.html" target="_blank">https://llvm.org/docs/doxygen/structLICM.html</a><br>
<a href="https://llvm.org/docs/tutorial/LangImpl5.html#for-loop-expression" target="_blank">https://llvm.org/docs/tutorial/LangImpl5.html#for-loop-expression</a><br>
<a href="https://llvm.org/docs/tutorial/LangImpl7.html#user-defined-local-variables" target="_blank">https://llvm.org/docs/tutorial/LangImpl7.html#user-defined-local-variables</a><br>
<a href="http://llvm.org/docs/lnt/modindex.html" target="_blank">http://llvm.org/docs/lnt/modindex.html</a><br>
<a href="https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl6.html#user-defined-unary-operators" target="_blank">https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl6.html#user-defined-unary-operators</a><br>
<a href="https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl5.html#for-loop-expression" target="_blank">https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl5.html#for-loop-expression</a><br>
<a href="https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl7.html#user-defined-local-variables" target="_blank">https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl7.html#user-defined-local-variables</a><br>
<a href="https://llvm.org/docs/tutorial/LangRef.html#instruction-reference" target="_blank">https://llvm.org/docs/tutorial/LangRef.html#instruction-reference</a><br>
<a href="https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl4.html#adding-a-jit-compiler" target="_blank">https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl4.html#adding-a-jit-compiler</a><br>
<a href="https://llvm.org/docs/tutorial/WritingAnLLVMPass.html" target="_blank">https://llvm.org/docs/tutorial/WritingAnLLVMPass.html</a><br>
<a href="https://llvm.org/docs/tutorial/Passes.html" target="_blank">https://llvm.org/docs/tutorial/Passes.html</a><br>
<a href="https://llvm.org/docs/tutorial/ProgrammersManual.html#viewing-graphs-while-debugging-code" target="_blank">https://llvm.org/docs/tutorial/ProgrammersManual.html#viewing-graphs-while-debugging-code</a><br>
<a href="https://llvm.org/docs/tutorial/SourceLevelDebugging.html" target="_blank">https://llvm.org/docs/tutorial/SourceLevelDebugging.html</a><br>
<a href="https://llvm.org/docs/tutorial/Frontend/PerformanceTips.html" target="_blank">https://llvm.org/docs/tutorial/Frontend/PerformanceTips.html</a><br>
<a href="https://llvm.org/docs/tutorial/GetElementPtr.html" target="_blank">https://llvm.org/docs/tutorial/GetElementPtr.html</a><br>
<a href="https://llvm.org/docs/tutorial/GarbageCollection.html" target="_blank">https://llvm.org/docs/tutorial/GarbageCollection.html</a><br>
<a href="https://llvm.org/docs/tutorial/ExceptionHandling.html" target="_blank">https://llvm.org/docs/tutorial/ExceptionHandling.html</a><br>
<a href="https://www.llvm.org/docs/doxygen/structLICM.html" target="_blank">https://www.llvm.org/docs/doxygen/structLICM.html</a><br>
<a href="http://llvm.org/docs/TestSuiteMakefileGuide" target="_blank">http://llvm.org/docs/TestSuiteMakefileGuide</a><br>
<a href="http://llvm.org/docs/doxygen/structLICM.html" target="_blank">http://llvm.org/docs/doxygen/structLICM.html</a><br>
<a href="https://www.llvm.org/docs/TestSuiteMakefileGuide" target="_blank">https://www.llvm.org/docs/TestSuiteMakefileGuide</a><br>
<a href="http://llvm.org/docs/tutorial/LangImpl5.html#for-loop-expression" target="_blank">http://llvm.org/docs/tutorial/LangImpl5.html#for-loop-expression</a><br>
<a href="http://llvm.org/docs/tutorial/LangImpl7.html#user-defined-local-variables" target="_blank">http://llvm.org/docs/tutorial/LangImpl7.html#user-defined-local-variables</a></span>
</div>
<div><br>
</div>
<div>Some of these are trivial mistakes (e.g. <a href="https://llvm.org/docs/tutorial/LangRef.html#instruction-reference" target="_blank">https://llvm.org/docs/tutorial/LangRef.html#instruction-reference</a>
-> <a href="https://llvm.org/docs/LangRef.html#instruction-reference" target="_blank">https://llvm.org/docs/LangRef.html#instruction-reference</a>),
and some require a bit more inspection.<br>
</div>
<div><br>
</div>
<div>Regards,</div>
<div>Patrick<br>
</div>
</div>
<br>
<pre class="gmail-m_2125254870880644934gmail-m_7144157776072443842moz-quote-pre">_______________________________________________
LLVM Developers mailing list
<a class="gmail-m_2125254870880644934gmail-m_7144157776072443842moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>
<a class="gmail-m_2125254870880644934gmail-m_7144157776072443842moz-txt-link-freetext" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
</blockquote>
</div>
</blockquote></div>
</blockquote></div>