I implemented a TeX engine in WebAssembly so you really can run TeX in the browser. You can see a demo of this at https://tex.rossprogram.org/ and at https://github.com/kisonecat/web2js you can find a Pascal compiler that targets WebAssembly which can compile Knuth's TeX. Interesting primitives like \directjs are also implemented, so you can execute javascript from inside TeX. The rendering is handled with https://github.com/kisonecat/dvi2html for which I finally fixed some font problems.
To make it relatively fast, the TeX engine gets snapshotted and shipped to the browser with much of TeXlive already loaded. So even things like TikZ work reasonably well. There is of course a lot more to do! The plan is to convert ximera.osu.edu to this new backend by the fall.
You can use the same code for server-side rendering: with dvi2html, you can run TeX on a .tex file to get .dvi and then produce a static html file. This renders not just the math but also the text using TeX's layout engine, which is often not what people want. Running it in the client means the TeX knows the width of the view.
>> Running it in the client means the TeX knows the width of the view.
+1 for running it server side. But knowing the width of the view was never supposed to be a thing with HTML and the web. That's what the layout engine in the browser is for. Also, knowing those metrics is a bit or two of fingerprinting information that the server should never be made aware of in the first place.
Unfortunately, the current platform we're using isn't very accessible: the \answer{...} blanks aren't converted to speech and don't navigate correctly. The new backend parses some TeX anyway to an AST, and the idea is that we could use that to produce alt text for the math. There's certainly a lot to do.
Sorry to sidetrack, but I'm also interested in implementing a TeX engine (in another language though). Any suggestions on how to learn how an engine works?
TeX's source code is available in a "literate" format, so you can download tex.web, run `weave tex.web`, and then `pdftex tex.tex`. This will produce a large book containing the annotated TeX source code. A challenge is that an engine built from plain tex.web won't support LaTeX3, so you'll need to use "change files" to modify tex.web to add the ε-TeX extensions and other updates.
The "server-side rendering" aspect of KaTeX is (and always has been, as far as I'm aware) front-and-centre on their landing page [1]. It allows you to input LaTeX and output HTML ready for inclusion server-side - the only thing this HTML needs to support it is the KaTeX CSS. This is very easy to do if your web server or static site generator is running on Node, and only a bit more difficult if it's running on something else (and therefore needs to shell out to Node). So if the author wants a good solution to server-side rendering, just look a little more at KaTeX.
On the flip side, I have reverted some of my server-side rendering of mathematics back to client-side rendering, because of considerations like webpage size. On mathematics-heavy pages, I found that pages that would otherwise be about 50KB in size got inflated up to about 1MB after server-side rendering all of the mathematics. After compression the difference was more like 70KB, but that difference is the entire size of the (compressed) KaTeX library. I think it is completely reasonable to only transmit LaTeX markup over the wire, and have a client-side library take care of the presentation (as we do for HTML, SVG, ...).
I've also investigated MathML, but cross-browser support is terrible and has been for years. You also still get the size explosion problem, because LaTeX markup is just so much more compact than whatever MathML soup is equivalent.
MathJax can also (apparently) render MathML on Chrome.
Maybe the answer is:
1. Write maths notation on your site with MathML. (You'd probably want to preprocess LaTeX notation into MathML somehow, because MathML isn't fun to write by hand, in the same way that septic tanks aren't fun to unblock by hand.) This will be displayed natively in Safari/Firefox, and be accessible to screen readers (apparently - I don't use one, so I don't know what it's like in practice).
2. Serve MathJax to Chrome users so they can see your maths.
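To see why preprocessing is essentially mandatory, here is what a single expression like $e^{-\frac{x}{2}}$ looks like in presentation MathML (hand-written for illustration):

```html
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <msup>
    <mi>e</mi>
    <mrow>
      <mo>-</mo>
      <mfrac><mi>x</mi><mn>2</mn></mfrac>
    </mrow>
  </msup>
</math>
```

That's ten lines of markup for about a dozen characters of LaTeX - which is also where the size-explosion complaint above comes from.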
Well, I hope that you at least took steps to avoid the other nasty side effect of client-side rendering: page content jumping all over the place when the page reloads, and the scrollbar not initializing at the right location (so cross-page anchors are useless).
I personally solved that problem by stuffing the equation into a separate Vue component and putting the KaTeX compilation into the render function. Indeed, a dedicated equation HTML tag supporting LaTeX (with what packages though?) would be nice. Getting the equation numbering to work nicely in an automatic fashion was a bit annoying though. I appreciate your comment on the page size, since I was considering moving to server-side rendering via Nuxt and hadn't considered this.
I've not used Vue, but how does this solve the problem? In most of my own sites, whatever preprocessor (eg markdown -> html) I use is instructed to replace mathematics with something like <span class="math">z = \frac{x}{y}</span>, then after the page has loaded a small loop runs KaTeX on all eligible spans. This results in some small amount of jumping, but I've found anchors to still work quite well.
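For concreteness, the whole pattern is only a few lines. A sketch (assumes the KaTeX script and stylesheet are already loaded; `throwOnError: false` makes a bad formula degrade to its source text instead of throwing):

```html
<!-- The preprocessor emits raw LaTeX in eligible spans... -->
<span class="math">z = \frac{x}{y}</span>

<script>
  // ...and after load, one loop renders each span in place.
  document.querySelectorAll("span.math").forEach(function (el) {
    katex.render(el.textContent, el, { throwOnError: false });
  });
</script>
```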
Vue is an SPA framework, so it solves the problem by "rendering all the HTML" as actual preprocessing and then serving it to the browser, which visually renders it - as opposed to the usual document.onLoad() approach, where the base page gets loaded and then JavaScript postprocesses it, iteratively updating the DOM and triggering re-rendering events. I'm not savvy to the precise details of Vue and browser lifecycle events, but I think it effectively builds a virtual DOM and then triggers a single rendering event to avoid the jumping around. But yeah, a single rendering vs multiple renderings.
Getting the client javascript to render the entire page has a variety of problems, hence my desire to server side render (during a build step) what is essentially static content. If the page size bloats up too much though I might need to reconsider the approach...
The worst part about the MathML debacle is that it actually used to work in Chrome a decade ago or so, but then it got removed because there was no real support for maintaining it within the Chrome team.
Yes, it's been almost eight years since Chrome removed their MathML support. The rationale at the time was unspecified security and performance problems.
A developer who worked on it, David Barton, had this to say in 2015:
> I volunteered for a year and got MathML working in Chrome 24, but Google turned it off in Chrome 25 because I couldn't afford to keep maintaining it for free. (Yes, grumble. Donate your nickels to Google.) There was a security bug, but Google had a fix, which has since been landed in WebKit and the Safari browser, for instance. No one on the Chrome/Blink team cared about MathML, so they preferred removing it to maintaining it. They tell people that a library like MathJax is good enough, but it isn't without native browser support for MathML - it's too slow for many use cases, it doesn't integrate well enough with CSS, etc. (the MathJax team agrees with all this). Presumably as digital textbooks gain in popularity, Google will rethink their position, or schools will have to use a different browser than Chrome (Firefox and Safari would both work). In the meantime, Google has no one working on MathML in Chrome at all, even part-time. Go figure. (And no, I would not work on it again.)
The Chromebook team at Google also removed the Japanese-input Dvorak layout on Chromebooks. I updated one day and the option just didn't exist any more - and I know I wasn't the only user. It's very frustrating, and I had to use some X11 shim commands to get around it until I ditched the Chromebook.
MathJax does not support this because, IIRC, it runs layout calculations in the browser whereas KaTeX passes it off to CSS.
If your argument is then that layout calculations should _also_ happen on the server then... I'm not sold and that would be a critique of web browsers and not math rendering.
It sounds like they want PDF. It’s rendered, not some image. But it’s also consistently laid out.
I think that’s a neat idea in general.
When I make little toy games for the web the part I hate the most is the boilerplate for ensuring every browser, mobile and desktop, gets a viewport of the same ratio. Would be neat to be able to say “give me a 16x9 viewport, and scale everything inside of it depending on how large that ends up actually being on the screen so that the same amount of content is seen by every user.”
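For what it's worth, modern CSS can express most of that directly. A sketch (the `aspect-ratio` property is fairly new, so browser support is worth checking):

```css
/* A stage that is always 16:9, as large as the viewport allows,
   letterboxed and centred when the screen has a different ratio. */
.stage {
  aspect-ratio: 16 / 9;
  width: min(100vw, calc(100vh * 16 / 9));
  margin: auto;
}
```

Scaling the content so every user sees the same amount is then a single `transform: scale(...)` (or a canvas context scale) computed from the stage's actual pixel size.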
> ...they use client-side rendering for static content. In my opinion, this is absurd.
Rendering equations client-side doesn't seem absurd to me. All of HTML and CSS and SVG is rendered client-side, why shouldn't equations be too?
I'm completely unclear why the author thinks equations specifically should be rendered server-side. Do they think HTML pages should be delivered as prerendered images or SVGs too...? Because it seems like the philosophy would apply the same way.
You’re misunderstanding what “client-side rendering” means. It’s not about rendering to images, it’s about rendering to HTML/CSS/SVG/whatever.
“Client-side rendering” means that what is served contains something not in the eventual form you desire to present it, and that it depends on some client-side scripting to convert it into the desired form. When the page loads, it will first show perhaps nothing or perhaps something like $e^{-\frac{x}{2}}$, and then the scripting will kick in and replace that with the proper HTML/CSS/SVG/whatever markup for the equation that the browser knows how to display.
“Server-side rendering” means that you do this translation from $e^{-\frac{x}{2}}$ to the desired HTML/CSS/SVG/whatever that the browser knows how to display, on the server, so that the browser can immediately display what it receives. Ideally this translation is also only done once, rather than on every request.
I know it's rendering to HTML/SVG whatever, but I still think that belongs client-side.
It's better to do it on the client because it's generally better to send less information over the wire, as well as send information in its most unprocessed form.
This applies, for example, to charts as well -- far better to send the raw data series values to the client and render it on the client-side with a charting library.
Both of these cases have the huge advantage that if you want to use the equation or data yourself, you can just "view source" or "inspect" and grab the equation or data directly to copy.
I don't think we need to be too concerned about the fact there's a slight delay in rendering equations. People are pretty used to browsers loading everything gradually, whether waiting for images to be the last thing to appear, or even lazy loading of images. There's nothing wrong with that.
But does this actually work well for layout? i.e. if you generate HTML/CSS output from math formulas server-side, will it correctly (from the math typography perspective) reflow if the viewport is resized etc?
I don’t know whether MathJax/KaTeX do anything special about wrapping. They could hard-code the lot, they could use markup and styles that will leave the browser to handle wrapping, or they could do active reflow where they adjust the DOM depending on the available space. But server-side rendering is still feasible and desirable even in that last case: you render to some baseline markup that will work in all cases but perhaps not have ideal layout, and then add a bit of optional JavaScript to reflow and optimise the layout. Ideally you separate the mathematical markup rendering from the reflow completely, so that you can ship much less JavaScript (more targeted and faster JavaScript, at that).
Yeah, that is my take as well, and I'm rather perplexed at what the author would like server-side rendering to output, given they don't like PNG and SVG either. HTML and CSS weren't designed with math in mind, and while you can force them to do the job (like KaTeX does), the result is sub-optimal. It looks fine, but it requires you to transmit orders of magnitude more data, and the HTML/CSS part isn't very suitable for accessibility software (KaTeX output does also include a MathML representation, which is more screen-reader friendly).
I definitely agree that is crummy for pages to display the markup first, then swap it out with rendered content later. Does anyone know why common libraries do it this way? Web development isn't my strong point, but I thought it was pretty easy to make the browser run some javascript before it renders the page.
My view on the state of maths on the web, as someone who uses a lot of maths on their website[1], is to recognise that loading times on the web shouldn't exist: just turn your formulae into images that you load using `loading="lazy"`. And of course, to make sure they fit any resolution, generate SVG images.
And no, MathML is irrelevant: you don't care about MathML, and your users don't care about MathML: all you care about is that users can read your formulae, and all your users care about is that they see decent-looking maths. As much as I like the idea of MathML, there is simply no reason to ever use it. Nothing is mining the web for maths, and semantic markup for maths buys you nothing.
You have a build system (because your content is generated from Markdown or the like - no one who wants to deploy a real site writes pure HTML in 2021), so make it generate SVG images by literally just running LaTeX during your build, replacing all your maths with SVG <img> code instead. Why would you even bother with MathJax or KaTeX? They put the burden on your users, which is ridiculous: you're building your content already, so just build static content for your formulae.
And sure, does your site have maybe 10 formulae? By all means, use MathJax or KaTeX. But if it relies on maths, generate your graphics offline using actual LaTeX (and this is trivial using github actions[3]) and use <img> elements that point to those SVG images.
(I run my maths through xelatex, then losslessly convert the resulting PDF to SVG by first cropping the PDF, then running pdf2svg[2]. Is that a lot of work? No, it is not. It's a one-time setup and it simply runs whenever content gets updated. It's about as no-effort as it gets)
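For anyone wanting to replicate the setup, the whole pipeline fits in a few Makefile rules. A sketch (not the poster's actual build; assumes xelatex, pdfcrop, and pdf2svg are installed, and one formula per .tex file):

```makefile
# eq/foo.tex (one formula each) -> svg/foo.svg
SVGS := $(patsubst eq/%.tex,svg/%.svg,$(wildcard eq/*.tex))

all: $(SVGS)

# Typeset the formula, then crop the page to its bounding box.
build/%-crop.pdf: eq/%.tex
	mkdir -p build
	xelatex -output-directory=build $<
	pdfcrop build/$*.pdf $@

# Losslessly convert the cropped PDF to SVG.
svg/%.svg: build/%-crop.pdf
	mkdir -p svg
	pdf2svg $< $@
```

Hook `make` into your deploy step and the SVGs regenerate whenever the content changes.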
What would you use it for? And that's a serious question: what would you use it for, as opposed to just using wolfram alpha or some other service that can already get you all the answers, analysis, and more, without having to mine MathML from random pages on the internet?
In the past, I used a math search engine[0] to find solutions for Olympiad problems, especially inequality ones. I imagine it would be useful when you want to find the name of some formulas or expressions that you came across, though probably not much more.
Semantic MathML is an absurdity, like marking up the tree diagram of all your English sentences, and linking all words to a URL with their dictionary definition. In short, the sort of thing only the semweb wonks could have dreamed up.
I like this! It's similar to what I did when turning formulae of an ebook from gif to something prettier and more editable. The whole book only had a dozen formulae, so manual work actually covered everything. I used the codecogs editor - https://www.codecogs.com/latex/eqneditor.php - which can emit svg and png. I got it working fine, with svg fallback, in GitHub markdown and epub.
Accessibility? Being able to copy/paste the formulae into formula editor or solver? Being able to easily style the formula (including for dark themes)? ...
When was the last time you actually wanted to do that, rather than just wanting to hypothetically raise that possibility for the sake of an argument about web technology?
Because in reality, based on my experience at least, no one actually needs that. Folks can copy a formula that they got from an image into wolfram alpha just fine. And the folks who can't don't actually benefit from MathML: they benefit from the JS Selection and Range functions, when site owners take the time to make sure that text-selection of a formula image leads to a LaTeX formula being put in the clipboard, instead.
"That's way more work" but since we're all using build systems anyway: no it's not. Write once, thousands if not hundreds of thousands of users benefit. The end.
I’ve come to the same conclusion, especially since I often end up using the same equations for both web and LaTeX documents, so it’s nice to have the same rendering engine everywhere, especially once you start making small tweaks to spacing etc.
One issue I’ve run into is that it’s not always that easy to get the style of the SVG images right, especially when it comes to sizing & placement. Did you find a good way to e.g. ensure that inline equations have the correct baseline alignment? Or a good method to ensure that equation sizes always match the font size of the paragraph they’re in?
I try not to ever use inline equations. Text is text, code is code, maths is maths; keeping them scoped to their own blocks tends to work better for readers.
> you should just turn your formulae into images that you load using `loading="lazy"`. And of course, to make sure they fit any resolution: generate SVG images.
One drawback to this approach is that SVG equations have a fixed layout, so e.g. they can't automatically line-wrap. Most of the equations on your Bézier page are pretty short, but I notice that you have elected to manually wrap them in some spots when they get too long, or in other cases just let them spill off to the side. This is most apparent with a narrow viewport (e.g. a phone), but you can also see this by using the responsive design mode on a desktop browser or even simply resizing the viewport. The longer equations get cut off on the right side and you have to scroll horizontally to see the whole thing, which isn't ideal. One of the benefits HTML is supposed to have over just, say, a PDF is the ability for the same document to reflow to different viewports.
The Bézier page illustrates another common issue with SVG images of equations: they have a lot of text in them, but none of it is searchable text. That means no Ctrl-F and no search engine indexing of that text. This is fixable via SVG, though, since SVG images can provide a searchable text layer. (I don't mean to single out your website, by the way, this is an issue with math all over the web. Also, selectable text is a longstanding bug in Cairo[0], which pdf2svg relies on to generate SVGs, so it's not an easy fix on your end anyway.)
> there is simply no reason to ever use it. Nothing is mining the web for maths, and semantic markup for maths buys you nothing.
MathML supports automatic linebreaking of equations. SVG doesn't. That's one simple reason to use MathML. I'm not sure whether this fits your definition of "semantic markup" or not, but it is useful. Linebreaking even has its own section in the MathML spec.[1]
It's also not true that nothing is mining the web for MathML. SearchOnMath[2], for example, indexes pages from the NIST DLMF[3], which uses MathML extensively.
Firefox has supported MathML for a very long time, and Chrome had an experimental version of it for a while before removing it. I wouldn't hold my breath waiting for universal browser support.
Igalia picked up the Chromium development, they actually have it in an amazingly good state at this point. This image compares rendering from Chromium with their patches, Safari, and Firefox (in that order) https://mathml.igalia.com/img/mathml-example-gamma.png
That being said I wouldn't hold my breath it'll ship this year either since there is still some work in LayoutNG to be completed at minimum but maybe 2022 we'll all be amazed it finally happened.
KaTeX supports server-side Node rendering pretty easily. https://katex.org/docs/node.html
Maybe MathJax does too, but I haven't checked.
I use this on https://vcvrack.com/manual/DSP to parse $inline$ and $$block$$ TeX into MathML with a couple of lines of server-side code.
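The snippet itself wasn't included above, but the delimiter-scanning part is short enough to sketch (in Python here; the actual TeX-to-MathML conversion is injected as a callable, e.g. latex2mathml's converter.convert - the names in this sketch are illustrative, not the original code):

```python
import re

# Match $$block$$ first, then $inline$; DOTALL lets block
# formulas span multiple lines.
MATH_RE = re.compile(r"\$\$(.+?)\$\$|\$(.+?)\$", re.DOTALL)

def render_math(html, convert):
    """Replace $inline$ and $$block$$ TeX in `html` using `convert`,
    a callable that turns a TeX string into MathML markup."""
    def repl(match):
        block, inline = match.group(1), match.group(2)
        return convert(block if block is not None else inline)
    return MATH_RE.sub(repl, html)
```

With latex2mathml installed, you would pass `latex2mathml.converter.convert` as the `convert` argument.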
I have used MathML with the latex2mathml Python library https://github.com/roniemartinez/latex2mathml, and it's great: you can render math without a single line of JavaScript. But in the end, only Firefox fully supports MathML, and without Chrome support it's kind of useless.
> they use client-side rendering for static content. In my opinion, this is absurd
I'm not a mathematician but what's the problem with client side rendering - I just about only write equations in jupyter notebooks, the syntax isn't great but it seems to work perfectly and scales to the resolution required. Math rendering is far from the most complex thing being rendered on the front end.
It tends to mess with anchor links, as the browser will scroll to the appropriate heading which gets pushed down as rendered content takes up vertical space.
Why do we expect a server to render maths for us? The client renders HTML/SVG. Maths is text, so we should take the same approach. There just needs to be a compact markup for maths notation; MathML is an awful mess.
I've had a similar argument for years. Before CSS could do rounded corners on boxes, it was typical to do a hack with JS building up the rounded corners pixel by pixel. On the project I was working on, since all the rounded boxes were generated by the same component, I eliminated the client-side JS in favour of a statically generated set of output that did the same thing. It rendered noticeably faster. And I was overridden by the other team members, who apparently felt it was better to have every single client render the rounded corners on their underpowered computers (this was for a company intranet, and from the discussions about system requirements, most of the machines were running old versions of Windows on old PCs) instead of doing it once and for all on the server.
I use server rendered SVG with mathjax as part of my static site generator. To try to make it as accessible as possible I add a title element with an ID to each SVG and use an aria-labelledby attribute to connect the two [1] (a sample for the interested, scroll about half way down [2]). The title content is the unrendered LaTeX source.
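The pattern described is roughly this (a hand-written sketch; real MathJax output would have the glyph paths where the comment is):

```html
<svg role="img" aria-labelledby="eq1-title" viewBox="0 0 120 24">
  <!-- The title carries the unrendered LaTeX source and is linked
       to the image via aria-labelledby. -->
  <title id="eq1-title">e^{-\frac{x}{2}}</title>
  <!-- rendered glyph paths go here -->
</svg>
```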
I'm very interested in the notion of using HTML and CSS rendering though! Many thanks to the author for pointing out this functionality.
I still render Markdown on the client. My blog is probably "slow" by 2000s standards, but it's far faster and more lightweight than a typical news website in 2020.
Every page involves the browser doing rendering work. HTML is neither fish nor fowl; it's not particularly easy to write by hand, but it's not particularly easy for the browser to parse either. I'm not convinced there are any good use cases for it, and certainly not for CSS.
If partial prerendering on the server-side meaningfully speeds up your site for end users then by all means do it. But I would follow "make it work, then make it work right, then make it work fast". Dropping in a single tag to do client-side rendering is more than enough for the 90% case.
Wow, your page goes through about five rendering forms on first load. First it appears in plain monospaced text, then it gets rendered to unstyled HTML, then one stylesheet loads that mostly just makes the text bigger (strapdown.min.css), and then another stylesheet loads that handles the rest of the layout and vanishes the header (readable.min.css), and then finally the header appears again as the Raleway font loads. That’s pretty major content- and layout-shifting, and not at all pleasant to behold. And all subsequent pages go through at least two forms (plain monospaced text, then fully rendered).
> Every page involves the browser doing rendering work.
Well yeah, but HTML, CSS and JavaScript are different beasts.
The browser can optimise HTML and CSS in various fascinating ways to provide a smooth experience and cope with loading problems in generally-useful ways; your site, if the JavaScript execution was effectively done in advance, would load faster, and skip at least the first three forms, going straight to fully-rendered-except-for-the-header or fully-rendered, depending on how quickly Raleway arrives. Subsequent page loads would go straight to the correct rendering.
JavaScript, on the other hand, is the most likely to not work, perhaps because it was disabled (including most spiders—client-side rendering is still decidedly bad for SEO, even if Google specifically has mitigated most of that for Google search inclusion), perhaps because it failed to load due to network conditions, perhaps because the browser is old or some such thing. Depending on JavaScript does make a site much less reliable. Sometimes that’s warranted, but I don’t think it is on regular content websites, ever.
By using JavaScript in the way you have done, you’ve guaranteed that it will render badly to begin with, that stylesheet loading will not be done smoothly, and that the site is less reliable and accessible. It’s… probably not a big deal, I begrudgingly acknowledge, but it does matter. It may still be faster than most news sites, but you hardly set a high bar there.
Heh, I used to use an explicit `display:none` to hide everything until it was actually rendered, but HN people complained about not being able to see anything with JavaScript turned off.
Thank you for removing it—as a disable-JavaScript-by-default person, I do prefer to see it as plain text than to see nothing.
A better compromise here is to hide it by default only when JavaScript is present: e.g. add style="display:none" to the root element in an inline script before any external scripts, and then remove that when you’re done, or onerror, or after a few seconds have elapsed. You’ll still run into the stylesheet loading issues, though—if you want to make it work as smoothly as the browser, you’ll need various onload/onerror handlers on <link> elements and the likes. Browsers do a lot of heavy lifting like this that is impossible, hard or fiddly to implement in JavaScript.
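A sketch of that first compromise (illustrative only; `render.js` stands in for whatever does the client-side rendering, and should un-hide the page when it finishes):

```html
<html>
<head>
  <script>
    // Runs only when JS is enabled, so no-JS users see plain text.
    document.documentElement.style.display = "none";
    // Fail open: if rendering never finishes, show the page anyway.
    setTimeout(function () {
      document.documentElement.style.display = "";
    }, 3000);
  </script>
  <script src="render.js"
          onerror="document.documentElement.style.display=''"></script>
</head>
<body>
  ...
</body>
</html>
```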
Another compromise that some have used is a timed animation so that if the code hasn’t finished loading and executing within a certain time frame, what’s visible is rendered anyway. Google’s AMP used this approach, and made a serious mess of it in its first release, hiding stuff for up to 8 seconds. This approach is fairly uniformly inferior to the previous paragraph’s approach.
We already have two standardized ways to write math equations: MathML and TeX-like notation. The former is the formal standard; the latter is what is overwhelmingly used.
It would be great if we could have both natively in the browser, and maybe support for math equations as a system-level feature of editors everywhere. But the green vomiting-face emoji gets more money than math equations these days.
This wasn’t very in-depth at all if I’m being honest.
I recently used KaTeX for server-side equation rendering with node. I think people tend to just use mathjax because it’s the more popular solution for web-based equations, but after spending a week trying to render server-side with mathjax and failing, I used KaTeX and haven’t looked back.
Ha! I used to get pretty pissed-off when I was doing my online math homework, only to find that you couldn't copy/paste the equations into Google. I always thought it was a prevention mechanism, but I guess it was a matter of the technology not being ready yet.
This article is odd. I use pre-rendered KaTeX with remark [1], plus their stylesheet on the client side. All of the article seems more specific to MathJax than the title seems to convey.
The author treats SVG as equivalent to PNG and complains that
> Images are impossible to use with copy/paste
You can select and copy text in an SVG.[0]
> Images are not nearly as responsive, and are difficult to style. Line breaking, fonts, and even colors are difficult to change when using images
This is partially true, but it's not difficult to inherit your page's text colour for SVGs.
> Images are completely opaque to users in need of screen readers
Not true of SVGs with text - and in fact SVGs can have alt text, which could in some cases be much more accessible to screen reader users than the raw equations.[1]