# CWYAlpha


## Thought this was cool: Coding Horror: The Future of Markdown

### October 25, 2012

Markdown is a simple little humane markup language based on time-tested plain text conventions from the last 40 years of computing.

Meaning, if you enter this…

    Lightweight Markup Languages
    ============================

    According to **Wikipedia**:

    > A [lightweight markup language](http://is.gd/gns)
    is a markup language with a simple syntax, designed
    to be easy for a human to enter with a simple text
    editor, and easy to read in its raw form.

    Some examples are:

    * Markdown
    * Textile
    * BBCode
    * Wikipedia

    Markup should also extend to _code_:

        10 PRINT "I ROCK AT BASIC!"
        20 GOTO 10

…you get this!

## Lightweight Markup Languages

According to Wikipedia:

A lightweight markup language is a markup language with a simple syntax, designed to be easy for a human to enter with a simple text editor, and easy to read in its raw form.

Some examples are:

• Markdown
• Textile
• BBCode
• Wikipedia

Markup should also extend to code:

10 PRINT "I ROCK AT BASIC!"
20 GOTO 10


You can think of Markdown as a radically simplified and far more human readable form of HTML. I have grown to love Markdown over the last few years. If you’re a programmer of any shape, size, or color, you can’t really avoid using Markdown, as it’s central to both GitHub and Stack Overflow. For that matter, my new project uses Markdown, too.

Markdown is a wonderful tool, but it does suffer a bit from lack of project leadership. The so-called “spec” is anything but, and there are dozens of different flavors of Markdown out there, all with differences in the way they behave. While they are broadly compatible, Stack Overflow and GitHub have both tweaked Markdown in ways that can trip you up if you’re familiar with one but not the other; compare GitHub Flavor with Stack Overflow Flavor.

That’s why I was so excited to get this email from David Greenspan a few days ago:

I’m the creator of EtherPad (a collaborative WYSIWYG editor), now working at Meteor. At Meteor, we’re trying to “pave the web” for developers by writing better components. For example, we just released universal login buttons that talk over WebSockets and are wired into the users table of the app’s database. Since Markdown is increasingly ubiquitous for writing content, it’s going to be part of the Meteor toolchain. I wouldn’t be surprised if we end up releasing a component like Stack Overflow’s editor, with the full “Meteor” standard of code quality, so that no one has to roll their own again. Today, we use Markdown in our API docs generation, and we’re going to be writing more and more content in it — which is a scary thought.
I think you and I share some concern (horror?) about Markdown’s lack of spec and tests. The code is ugly to boot. Extending or customizing Markdown is tricky (we already have some hacks and they are terrible), and I worry about “bit rot” of content if the format doesn’t have a spec. I’m evaluating the possibility of starting over with a new implementation coupled with a real spec and test suite, and I’ve been thinking a lot about how to parse a language like Markdown in a principled way. I’m pretty fearless about parsers, by the way; I wrote a full ECMAScript parser in a week as a side project.
I want this new language – working name “Rockdown” – to be seen as Markdown with a spec, and therefore only deviate from Markdown’s behavior in unobtrusive ways. It should basically be a replacement that paves over the problems and ambiguities in Markdown. I’m trying to draw a line between what behavior is important to preserve and what behavior isn’t.

I was excited because, like David, I freaking love Markdown. I love it so much that I want to see it succeed and flourish over the next 20 years. I believe the best way to achieve that goal is for the most popular sites using Markdown to band together and take ownership of Markdown as a standard. I propose that Stack Exchange, GitHub, Meteor, Reddit, and any other company with lots of traffic and a strategic investment in Markdown, all work together to come up with an official Markdown specification, and standard test suites to validate Markdown implementations. We’ve all been working at cross purposes for too long, accidentally fragmenting Markdown while popularizing it.

Like any dutiful and well-meaning suitor, we first need to ask permission for this courtship from the parents. So I’m asking you, John Gruber: as the original creator of Markdown, will you bless this endeavor? Also, as a totally unrelated aside, have I mentioned what a huge Yankees fan I am? Derek Jeter is one of the all-time greats.

I realize that the devil is in the details, but for the most part what I want to see in a Markdown Standard is this:

1. A standardization of the existing core Markdown conventions, as documented by John Gruber, in a formal language specification.
2. Make the three most common real world usage “gotchas” in Markdown choices with saner defaults: intra-word emphasis (off), auto-hyperlinking (on), automatic return-based linebreaks (on).
3. A formal set of tests anyone can use to validate a Markdown implementation.
4. Some cleanup and tweaks for ambiguous edge cases that exist in Markdown due to the lack of a formal specification.
5. A registry of known flavor variants, with some possible future lobbying to potentially add only the most widely and strongly supported variants (I am thinking of the GitHub style code blocks which are quite nice) to future versions of Markdown.
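
To make the three "gotchas" concrete, here is a minimal Python sketch of how they could be exposed as switches with the defaults proposed above. The function `render_inline` and its three flags are hypothetical names invented for this illustration; it is a toy that only understands `_emphasis_`, bare URLs, and newlines, not a real Markdown implementation.

```python
import html
import re


def render_inline(text, intraword_emphasis=False, autolink=True, hard_breaks=True):
    """Toy inline renderer; hypothetical helper, not a real Markdown library API."""
    out = html.escape(text)

    # 1. Emphasis. With intra-word emphasis off, an underscore only opens or
    #    closes emphasis when its outer side is not a letter or digit.
    if intraword_emphasis:
        pattern = r"_(?=\S)(.+?)(?<=\S)_"
    else:
        pattern = r"(?<![A-Za-z0-9])_(?=\S)(.+?)(?<=\S)_(?![A-Za-z0-9])"
    out = re.sub(pattern, r"<em>\1</em>", out)

    # 2. Auto-hyperlinking: wrap bare http(s) URLs in anchor tags.
    #    (URLs containing underscores are not treated specially in this toy.)
    if autolink:
        out = re.sub(r"(https?://[^\s<]+)", r'<a href="\1">\1</a>', out)

    # 3. Return-based line breaks: every newline becomes a hard break.
    if hard_breaks:
        out = out.replace("\n", "<br>\n")
    return out


print(render_inline("snake_case_name is _really_ common"))
print(render_inline("snake_case_name", intraword_emphasis=True))
```

With the proposed defaults the identifier `snake_case_name` survives untouched, while switching intra-word emphasis on italicizes the middle of it, which is the surprise the proposal wants to avoid.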

And that’s it, really. I don’t want to extend Markdown by adding tons of crazy new functionality, or radically change the way it currently works, or anything like that. I’d be opposed to such changes. I just want to solidify and standardize the simple, useful version of Markdown that is working so well for everyone right now. I want there to be an unambiguous, basic standard that everyone using Markdown can expect to work in the same way across all web sites in the world when they begin typing.

I’d really prefer not to fork the language; I’d much rather collectively help carry the banner of Markdown forward into the future, with the blessing of John Gruber and in collaboration with other popular sites that use Markdown.

So … who’s with me?

Posted by Jeff Atwood

I don’t know Markdown fully, but reading the example, there was the text “_code_” which I thought and hoped would parse into an underlined text, but it didn’t – it was italicized.

Would it be anyhow possible to make “_text_” parse into an underlined text? I think that’s what two underscores represent very well.

I would strongly recommend a hard look at asciidoc (http://www.methods.co.nz/asciidoc/) as a mature (10 year+) and extensible text markup format, and as a mature bunch of code to process it.

If you leave out most of the features, you can make it look like Markdown, too 🙂

I’m reminded of the guy who decides that there should be one standard
because there are n divergent implementations. So he goes and writes
his own. Now there are n+1 divergent implementations.

Of course, I understand that he’s talking about a process to go along
with a blessed and convergent implementation, but putting forward the
creation of an implementation as the goal of this standardization
process may, I think, be premature. It’s so easy for people to get
dragged into the idea of getting something done that everyone simply
moves in their own direction again and the convergence doesn’t happen.

I’m not saying that’s what he’s calling for, but I’d be careful. Rather
than an implementation, I think what’s needed is the will of the major
consumers (sounds like you’re on board, so that’s a good start), a good
sense of compromise, a willingness to recognize the desirable features
which have evolved to address actual shortcomings of John’s original
spec, and a discipline about preventing feeping creaturism.

Most of all, however, I would say that there needs to be a concrete and
formal grammar. This should be the goal and distillation of all of that
process. Tests yes, of course, hand in hand with it, but a formal
grammar which eliminates all ambiguity in the language (and therefore
in the hopefully many standards-compliant implementations).

I would propose formalizing the language in a [Parser Expression
Grammar]. There’s great tooling available (even in js), PEGs are very
comprehensible, and in fact, it’s already been done more than once
already. What’s lacking is a blessed PEG and implementations of the
same spec in multiple languages.
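
As a rough illustration of what a grammar-first approach buys, here is a minimal Python sketch of a PEG-style recognizer for a tiny inline subset. The grammar in the comment and the function names are invented for this example; they are not taken from Pandoc, peg-markdown, or any proposed spec, and the output is not HTML-escaped.

```python
# Toy grammar, written in PEG notation:
#   Inline <- (Strong / Emph / Text)*
#   Strong <- "**" (!"**" .)+ "**"
#   Emph   <- "_"  (!"_"  .)+ "_"
#   Text   <- .


def parse_delimited(s, i, delim, tag):
    """Match Delim (!Delim .)+ Delim at index i; return (html, next_index) or None."""
    if not s.startswith(delim, i):
        return None
    start = i + len(delim)
    end = s.find(delim, start)
    if end <= start:  # no closing delimiter (find returned -1) or an empty body
        return None
    return f"<{tag}>{s[start:end]}</{tag}>", end + len(delim)


def parse_inline(s):
    """Apply the Inline rule to the whole string and return the rendered text."""
    out, i = [], 0
    while i < len(s):
        # Ordered choice: the first alternative that matches wins, so the
        # same input can never be read two different ways.
        result = (parse_delimited(s, i, "**", "strong")
                  or parse_delimited(s, i, "_", "em"))
        if result is None:
            result = (s[i], i + 1)  # Text <- .
        piece, i = result
        out.append(piece)
    return "".join(out)


print(parse_inline("a **b** and _c_"))
```

The point of the sketch is the ordered choice: an unclosed `**` or `_` simply falls through to the `Text` rule, so every input has exactly one parse, which is the property a formal grammar is meant to guarantee.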

I can’t help with any of those things, 🙂 but I can help with a couple
technical observations.

– In my book, [kramdown] is the current best-of-breed. A spec needn’t
be quite so ambitious, but I find support for element attributes and
basic table syntax to be essential.
– [Pandoc] has the tightest and most complete implementation, albeit in
Haskell. A good start would be to lift the PEG from it. There is one
other PEG floating around, but I couldn’t name it off the top of my head
and it’s not as rich as Pandoc’s.

[Pandoc]: http://johnmacfarlane.net/pandoc/
[kramdown]: http://kramdown.rubyforge.org/
[Parser Expression Grammar]: http://pegjs.majda.cz/

Interestingly enough, the Markdown standardisation topic just took off today on the Markdown discussion list. It is happening here: http://markdown.github.com

I personally believe that writing raw Markdown markup is not for everyone. That’s why we are building a true WYSIWYM editor for Markdown. http://www.texts.io/

I don’t think tables are a good fit for markdown. I’ve worked with Pandoc and Asciidoc as well as most wiki variants like Creole and Mediawiki. Using text to describe a table structure horribly sucks. If you want people to make tables, allow the basic HTML needed for them to do it.

I would really love to see a single unified implementation that could just become a part of popular languages. If you use Ruby, Python, PHP, Perl or whatever else you have in your toolbox, there’s libraries to handle markdown with many behaving differently in subtle ways (like the gotchas you list).

A simple testable implementation would encourage those languages to just natively support it, giving Markdown even more adoption.

By the way, if you use Markdown for lots of stuff (leanpub, project documentation, etc) I really recommend seeking out and installing Markdownpad. It’s like Notepad, but uses Markdown with a preview that can be easily customized by anyone that knows simple CSS. Just associate it with the ‘.md’ file type and off you go. I think it prints nicely too, but I’m not sure, I haven’t owned a printer in over a decade.

I’m in Jeff!
We run a large business to consumer site that is open to public and needs to be secure.

Markdown is perfect for injection prevention and, most of all, repurposing of content (i.e. applying different style interpretations, text versions of content and publishing to HTML, PDF or other formats).

I’ve done dozens of CMS projects and am totally over the quirks of html based editors, the security issues they can cause and the inability to repurpose the generated content.

We currently use markdown although it can be a little tricky to get clients into it. If we had time we’d be doing this ourselves, but I’d love to be in the process.

Looking forward to a unified Markdown standard with a good wysiwyg editor view.

Hi Jeff,

Referring to “Wikipedia” as a lightweight markup language is both a misnomer and slightly inaccurate.

The correct terminology for the markup language that Wikipedia supports is wikitext. Wikipedia is one of numerous sites that use MediaWiki as a platform. MediaWiki is capable of parsing wikitext.

The crazy thing is that “lightweight” markup languages are only lightweight in terms of ease of use. Parsing SGML is context-free, whereas wikitext and markdown are both context-sensitive, and therefore more complex to write parsers for.

I agree with you a lot Jeff, you can’t really avoid Markdown. Although I’m hesitant to agree that it needs standardization.

You say it’s a humane markup language, well if it’s meant to be human-friendly then that doesn’t exactly lend itself to standardization. Different people have different preferences. I’m no expert by any means, but that’s just my initial reaction.

For any beginners reading this, you might want to check out my introduction to Markdown that I wrote on my blog:

http://codeconquest.com/learn-markdown-youll-thank-yourself-later/

The owners of a spec aren’t its creators. It’s the users. You don’t need Gruber’s stamp any more than you need mine.

It’s easy to get confused. The creators of open source projects are often intimately involved with the interests of the users. But it’s not the case here. The active stakeholders seem aligned. That’s all you need. Is there any real dissent besides Gruber’s inactivity?

(Users include both sides of the fence – people implementing the translation and people writing in Markdown)

I tried to love Markdown and Textile, but as a web developer it just doesn’t make sense to write articles with it. I’m often confused about the syntax for writing links and images, and if you want to write about HTML you’ve got to escape your HTML snippets.

For this reason I wrote my own text formatter, it’s basically HTML with a lighter syntax, and like so called “lightweight” markup languages paragraph tags are implied.

http://nbsp.io/development/doccy-a-mid-weight-markup-language

That article mentions Symphony, a CMS framework, but it’s not tied to that at all, the source code is at:

https://github.com/rowan-lewis/doccy

It’s also written in a more sophisticated way than your usual text formatter; by parsing the input and building the output using an XML DOM, you’re guaranteed that the output is sane.

People who are suggesting alternatives or saying that Markdown is too difficult are missing the point.

We ARE going to use Markdown. We like it.

I think it’s a great idea to standardize Markdown. My site uses it for our user submitted content and I think standardizing it is just what the doctor ordered.

A better spec is sorely needed. There have been calls for this for years on the markdown-discuss mailing list, but never any uptake. I wish you luck in persuading John Gruber, who has been resistant even to requests for informal clarifications of the spec. Here is his most recent contribution to the markdown-discuss list, after a long period of absence:

http://www.mail-archive.com/markdown-discuss@six.pairlist.net/msg02703.html

In any case, I am willing to help out. I have written three markdown implementations: pandoc (in Haskell, using parser combinators, http://johnmacfarlane.net/pandoc), peg-markdown (in C, using a PEG grammar, https://github.com/jgm/peg-markdown), and lunamark (in lua, using lpeg, https://github.com/jgm/lunamark). So I know quite a bit about parsing markdown, and particularly about what would have to be settled in a more determinate spec.

To start, here is a list of some big (not edge-case) questions that the current syntax description leaves open:

The list includes links to a tool which will show you the output of a bunch of different implementations, so you can see how they differ. Further up in the FAQ you’ll find a longer list of divergences between various implementations (including lots of bugs).

Fyodor —

Bless you! I was going to post that the world is in desperate need of a WYSIWYG Markdown editor, but I see you have that covered.

Thanks —

I’m in, with what I can.

I wrote https://github.com/trentm/python-markdown2
It is basically a straight port of Gruber’s Markdown.pl — i.e. it is regex based. Plus, like most processors, it adds a number of extras/extensions.

If this effort bears fruit, perhaps the most useful part of my implementation would be the test suite that I use: https://github.com/trentm/python-markdown2/tree/master/test/tm-cases

Some of those tests are specific to markdown2 (e.g. those tagged with “extra” in the .tags files). If helpful I could easily separate out those tests that are for core Markdown functionality to a separate repo.

The other “*-cases” dirs in https://github.com/trentm/python-markdown2/tree/master/test are copies of (likely old versions of) test suites from other Markdown projects.
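
A shared test suite of this kind could be driven by a very small harness. The sketch below assumes each case is a pair of files, `name.text` (Markdown input) and `name.html` (expected output), in one directory; that layout and the names `run_cases` and `normalize` are assumptions made for illustration, not a description of how markdown2's own test runner works.

```python
from pathlib import Path


def normalize(html_text):
    """Ignore blank lines and leading/trailing whitespace when comparing output."""
    lines = (line.strip() for line in html_text.strip().splitlines())
    return "\n".join(line for line in lines if line)


def run_cases(cases_dir, render):
    """Run render() over every *.text file; return the names of failing cases."""
    failures = []
    for src in sorted(Path(cases_dir).glob("*.text")):
        expected_path = src.with_suffix(".html")
        if not expected_path.exists():
            continue  # no expected output recorded for this case
        actual = render(src.read_text(encoding="utf-8"))
        expected = expected_path.read_text(encoding="utf-8")
        if normalize(actual) != normalize(expected):
            failures.append(src.stem)
    return failures


# Example use with a real implementation (requires the markdown2 package):
#   import markdown2
#   print(run_cases("test/tm-cases", markdown2.markdown))
```

Because `render` is just a callable from Markdown text to HTML text, the same directory of cases can be pointed at any implementation, which is what makes a common suite useful for comparing flavors.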

Good luck,
Trent Mick (@trentmick, github.com/trentm)

I understand why you made the announcement, but part of me thinks you would have been better off to work with all of your partners in private to produce something quickly. People are already trying to tell you how this needs to be done (including me, I guess). The only thing worse than work produced by a committee is when the committee solicits input from the general public.

I do like that you are starting with Gruber’s basic description. If I were running this, I would limit the scope of the initial release to what Gruber describes as much as possible. The smallest possible spec with reference implementation should be all you need for v1.0.

Good luck. You’re probably going to piss off as many people as you please with what you eventually release.

I feel dirty pimping my project, but I think it’s relevant enough and sets out one angle of my interest in Markdown: I created a Chrome/Firefox/Thunderbird extension called Markdown Here (MDH) that lets you write your email in Markdown and then render it before sending. Check it out: https://github.com/adam-p/markdown-here — I wrote it because I wanted it to exist, and it’s actually pretty sweet.

I’m very ambivalent about the prospect of a new Markdown spec.

In favour of a new spec:

• Experience with one MD dialect is irritatingly non-transferable.
• I work on projects on both Github and Bitbucket. I’m pretty comfortable with GFM at this point, but I struggle to figure out Bitbucket’s dialect (seems to have undocumented backtick-fences? but not syntax name? unlike GFM there’s no clear description of it?).
• Even dialects have dialects.
• In MDH I use the JS renderer Marked (https://github.com/chjj/marked) — specifically in GFM mode (it was the best GFM-supporting JS lib I could find). But even it doesn’t exactly implement GFM’s mods (line breaks, tables).

• I have trouble believing that any single spec can encompass enough of the MD extensions to “win”. And I think winning is probably necessary, or else the “n+1 specs” objection is compelling.
• Maybe some kind of extensibility can be built into the spec? Cool, but complex. (Standardized flags indicating what table format to use? Editor symbols to quickly show users what table format is available?)
• Would such extensibility really gain us enough/anything over what we have now? (I don’t mean to be glib with that question. I think it might gain us a lot.)
• Tangent: If MD starts getting used a lot for extracted code docs, there’s going to be a push for Javadoc-ish extensions.

With the help of a user I, uh, added TeX math formula support to MDH. Like so: “$-b \pm \sqrt{b^2 – 4ac} \over 2a$”. So I’ve done my part to dirty the dialect waters. I don’t feel good about this.

I think that one of the issues is that there’s a tension between “MD as primarily markup” and “MD as primarily plaintext”. Two examples of this, from GFM:

• Fenced code blocks. Indented blocks of code look pretty nice when reading plaintext. Fenced code doesn’t look as nice, and even less so when you specify the language. But writing/pasting 4-space indented code is more of a hassle, and it’s not clear how to specify the language.
• GFM line breaks. Gruber’s original spec basically discarded single linebreaks; this allowed MD writers to maintain 80-char lines without breaking flow when rendering. In contrast, GFM interprets a single linebreak as a <br>. I don’t like this GFM change, but I also don’t like the original spec’s “two spaces at the end of the line to get a <br>”.
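
The two line-break rules contrasted here can be sketched in a few lines of Python. The function names are made up for this example, and the "GFM" rule is implemented only as this comment describes it (every single newline becomes a break), not from any official specification.

```python
import re


def breaks_original(paragraph):
    """Original-style rule: only two or more trailing spaces force a break."""
    return re.sub(r" {2,}\n", "<br />\n", paragraph)


def breaks_gfm(paragraph):
    """GFM-style rule as described above: every single newline is a break."""
    return re.sub(r" *\n", "<br>\n", paragraph)


# The first line ends with two (invisible) trailing spaces.
poem = "Roses are red,  \nviolets are blue,\nsugar is sweet."
print(breaks_original(poem))
print(breaks_gfm(poem))
```

Under the original rule only the first line breaks, because only it carries the invisible trailing spaces; under the GFM-style rule all three lines break, which also means hard-wrapped 80-column prose would render with a break on every line.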

Those examples might seem pretty minor, but the more we extend MD, the less plaintext-readable it becomes. Probably. (I don’t mean for that statement to be shrill. I love many of the extensions. And maybe it’s okay that extensions get somewhat less plaintext-readable if the base stays clean. Artificially constraining MD’s growth sure won’t work, anyway.)

Ile: I think the reason _this_ is shown in italics rather than underlined is that underlined text looks like a hyperlink. Underlining has been suggested and rejected on Stack Overflow.

Hey Jeff,

I think a standardized grammar and (executable) specs would be a great thing. Having everyone rally around the common goal, bringing consistency to Markdown processing and representation across web/desktop/mobile applications would be really nice.

The challenge is that Markdown can be used in a number of contexts. Perhaps a small safe subset for blog comments, vs. writing a book with something like Leanpub, where they support a variety of Kramdown extensions.

Given the variety of extensions, the scenario sounds a lot like trying to become the W3C of Markdown implementations. A big effort, but I suspect a really good outcome. Thanks for taking this on. 🙂

Nathan.

“automatic return-based linebreaks (on)” — Please, no. Markdown already includes syntax for lists and code blocks. There are very few other occasions where you need a hard line break. Currently markdown works both for people who like to hard-wrap their text and for people who don’t. It’s best to keep it that way. The proposed change would radically change how most existing markdown documents are rendered. And why? Because some new users are surprised that hard breaks are treated as spaces? The same users are surprised when indented paragraphs are treated as code. No matter what the rules are, some users will be surprised by them.

Saying it’s a choice with a default doesn’t help. That just fragments markdown into many variants, so that you can’t be sure that markdown that works fine on one site will render the same on another. It would be best to reduce fragmentation rather than fostering it.

This is exactly what I wanted to do a few months ago. Sadly, I was all alone and my coding skills weren’t good enough for me to write a Markdown parser without even thinking about it. 🙂

So here’s what I would like to see:

* Solid documentation (how that all works, edge-cases),
* Code that makes you cry tears of joy (very easy to read and, later on, to extend or port),
* Default behaviour must be XSS safe (not as it is in PHP port of Markdown)!

P.S. In what language will it be implemented?

reStructuredText has already been mentioned. It has a single mature definition.

Personally, I had no opinion on which was better, till I needed to mark up a poem in Markdown. How does Markdown represent text with
explicit
line breaks
(like this)? Why, with invisible white space! That was a poor design decision.

Unfortunately, I think that reStructuredText is set to be the Betamax of lightweight markup languages to Markdown’s VHS: technically superior, but eventually eclipsed.

The one feature I truly miss in some Markdown implementations is the ability to specify code block language. For example on Github, I can do:

    ```javascript
    ```

And the code block will be syntax highlighted for Javascript.

Oops, can’t edit comment. Just wanted to add that Posterous also had that feature but the syntax was different.

#!javascript

I really wish there was only one way to do it all over the place.
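
For the GitHub-style fence, pulling out each code block together with its declared language is straightforward. This is a minimal sketch assuming fences are exactly three backticks at the start of a line with an optional language name after the opening fence; the names `FENCE` and `code_blocks` are invented here, and real implementations handle more cases (longer fences, tildes, indentation).

```python
import re

FENCE_MARK = "`" * 3  # three backticks, built this way so the example embeds cleanly

FENCE = re.compile(
    "^" + FENCE_MARK + r"[ \t]*(?P<lang>[\w+-]*)[ \t]*\n"  # opening fence, optional language
    + r"(?P<code>.*?)"                                      # block body, as short as possible
    + "^" + FENCE_MARK + r"[ \t]*$",                        # closing fence on its own line
    re.MULTILINE | re.DOTALL,
)


def code_blocks(markdown_text):
    """Return a list of (language or None, code) pairs for each fenced block."""
    return [(m.group("lang") or None, m.group("code"))
            for m in FENCE.finditer(markdown_text)]


sample = "Intro\n" + FENCE_MARK + "javascript\nvar x = 1;\n" + FENCE_MARK + "\nOutro\n"
print(code_blocks(sample))
```

A highlighter can then dispatch on the language name, and a block with no name can fall back to plain text or to language detection.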

After reading one of the first comments, I digress: I wish /this/ was italic rather than _this_, but I know that won’t come out of **this**.

“Bless you! I was going to post that the world is in desperate need of a WYSIWYG Markdown editor, but I see you have that covered.”

Tim Post already mentioned it, but I want to call it out: http://www.markdownpad.com/

I use it virtually every day. Actively developed too: it auto-updates itself quite regularly.

I never knew what I was using on reddit until I read this article. I’m definitely a fan and hope to see it in some of my favourite technologies like Meteor and StackOverflow.

Just to continue what @DaGrevis said… I would love for the default implementation NOT to accept HTML, and instead to render whatever tags are used as plain text. This is because in most cases, you don’t want a random user typing in a [script] tag or anything like [a onclick=”evil code”]
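
A safe-by-default mode along these lines could escape all raw HTML first and only then generate markup of its own. The sketch below is a toy under that assumption: `render_untrusted` and `ALLOWED_SCHEMES` are hypothetical names, it only handles inline `[label](url)` links, and a real implementation would also need to keep `>` working for blockquotes.

```python
import html
import re

# Link targets that this toy is willing to turn into real anchors.
ALLOWED_SCHEMES = ("http://", "https://", "mailto:")


def render_untrusted(text):
    """Escape every raw HTML character, then render only inline links."""
    escaped = html.escape(text, quote=True)

    def link(match):
        label, url = match.group(1), match.group(2)
        if not url.lower().startswith(ALLOWED_SCHEMES):
            return match.group(0)  # leave suspicious links as inert text
        return f'<a href="{url}">{label}</a>'

    return re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", link, escaped)


print(render_untrusted('<a onclick="evil()">hi</a>'))
print(render_untrusted("[site](http://example.com)"))
```

Because escaping happens before any tags are emitted, typed-in tags show up as visible text, and the only HTML in the output is what the renderer itself wrote.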

And then there’s Reddit flavor. You forgot about _the_ most important one!

An answer both to Lilleyt and for the post, regarding technical issues of implementing it in JavaScript.

I personally tried to implement a PegJS-based parser for Markdown (see links below). However, the PegJS-generated result is huge, about 10+MB. That is partly because my version of the parser is certainly not finished or perfect and there are many ways to optimize it, and partly because PegJS by David Majda renders rules in a plain way, with no operator functions or anything like that. This helps speed, but it hurts parser size badly. So, although I haven’t finished the Markdown parser, about a year ago I started to refactor the PegJS implementation to have a function for each operator and to improve scoping and so on. The resulting parser generator may produce parsers that run a bit slower than the original’s, but they will weigh far fewer bytes: it is minimalistic in style, unused operators are excluded, and so on. I am still working on it, writing code in small portions; I can see a good end, but it is still in progress.

So, here are two facts I have for now:
– It is hard to represent some complex rules of Markdown in PEG, like blockquote-in-list-followed-by-block-of-code, but it is achievable, since there is a good enough implementation in C++ by Ali Rantakari (however, it also fails in some complex variations).
– The current version of PegJS is not a very good match for it, at least for now (or maybe I am very wrong in the way I am implementing the parser). It will be almost impossible to include the parser in mobile applications and so on.

My version of PegJS parser for Markdown, in progress: https://github.com/shamansir/mdown-parse-pegjs
My customized version of PegJS, intended to produce very compact parsers using the powers of functional code, in progress: https://github.com/shamansir/pegjs
C++/PEG GUI-oriented implementation of Markdown parser: http://hasseg.org/peg-markdown-highlight/
Useful links on parsing Markdown: https://github.com/shamansir/mdown-parse-pegjs#sources
MDTest, to test your Markdown parser for compatibility with the spec: http://git.michelf.com/mdtest/

Some thoughts regarding parsing Markdown in general and its improvements.

Markdown became a geeky language, an easier version of LaTeX, reduced in functionality in favor of speed of writing. However, “geeks” does not mean only programmers; it also covers designers and even literary authors. As a result, the language spec need not require all these programming-language marks and so on in the plain version; perhaps it should detect the language by itself, using an approach similar to Stack Overflow’s. Or it should not have special syntax for it but use HTML comments to mark a language, since they are supported everywhere (I know that copying code with 4 spaces before it is inconvenient, but I think that is easily resolvable in any modern code editor, and doing it any other way than John recommended at the start breaks Markdown compatibility).

And I agree, there is a huge “want” to include tasty features in Markdown, but there should be a very strict selection of such features, because it is very hard to make all of them still look lovely. So someone (like John), and only that person, should say “I said so and so it’ll be”. Or including features should be based on votes. Or, even better, a plugin-like system may save us all, if there is a central plugin repository (say, “Markdown Flavors”) with parsers/PEGs for every programming language, where including one is as easy as including a script tag or header file in your document. And, of course, there should be a central distribution site, where all tests will run every second for the main implementation itself, and a CDN for every parser/plugin and so on…

BTW, Mou is the best Markdown editor for Mac OS for me; it handles almost all of the list-in-list-in-blockquote problems.

This sounds great and I’m really glad to see Pandoc is already in this discussion. I work in educational technology and also with researchers in academia, and Pandoc is really something I’ve been recommending to all researchers frustrated with Word and other WYSIWYG word processors. Having tables and footnotes makes it an ideal markup for researchers.

I’d just like to point out that this probably isn’t the best idea…
http://xkcd.com/927/

IF you can get Markdown’s parents to agree and mention you in their page, you have a chance. Otherwise, as others have pointed out, you’re just the 15th standard where only 14 existed before.

Also, if you do try to do this thing, I’d advise making several Markdown “profiles”. The “basic” profile would cover the current Markdown without any additions (as you’ve said), while the “extended” profile will add all the new features and bells and whistles. I’m in the camp that thinks that the basic Markdown is too limiting and needs more features (like tables, colors, fonts, etc)

There are many different use-cases for a language like Markdown, and trying to make a “one size fits all” solution rarely works.

fully supportive of this idea. Taking it a tiny bit further, I think this is ideal for the Community group of W3C. Couple of other sponsors (not required to be W3C members) and the work can be done jointly there with the potential of taking it further if needed when the spec is done.

http://www.w3.org/community/groups/proposed/#markdown if you would like to see it developed this way.

HTH

I think this is a really important step. What has most concerned me about the markdown clones I have seen springing up is the feature creep. So I have been in favor of a spec to lock down (clarify) the essential features of markdown for a while, as well as a road-map or guidance on how extension should be developed to play nicely with the core features.

I would love to help.

I like Markdown a lot, but for semi-complex to complex work I have to use ReStructuredText, despite what I consider a few flaws.

What I think is an important win for ReST is its extensibility, a feature absent from Markdown.

Your initiative would certainly, IMHO, be more beneficial if the resulting standard includes an extensibility mechanism with features on par with those of ReST.

I prefer reStructuredText too, but unfortunately Markdown “won” the simple markup market.

You know, the thing that stuck out the most for me was the Yankees symbol. I respect you a little more (growing up in the Bronx, being a Yankees fan for over 30 years and going to 20+ games a year).

That said, blame RChern for the gravatar, as I lost a bet with her comparing our favorite teams’ respective playoff performance.

What I think is really of value are the test cases. Any resulting implementation is nice, but for me it would be only a reference implementation.

If just Stack Exchange and GitHub agreed on a “standard” AND STOPPED USING their current in-house systems, it would be a good start.

So a converter for all the content that has already been created must be part of the effort.

In my opinion, Markdown has outgrown its original idea of “simpler HTML” and is becoming a universal document format. To serve this purpose it needs more than just a spec for the file format. It needs a good simple object model for text that can become a basis for implementations of parsers and formatters in several different languages.

Most of the existing implementations work one way, i.e. from Markdown to HTML and don’t have explicit model for text inside. Those who do, don’t preserve document structure on load-save (for example, references are renamed on conversion from Markdown to Markdown).

I’ve been working on a JSON-based model for Markdown for some time. It currently has a PEG-based parser, several formatters and a set of language-independent test cases. It also integrates with Pandoc and MultiMarkdown to import documents in their formats.

https://github.com/sheremetyev/texts.js
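To illustrate the kind of object model described above, here is a minimal Python sketch. It is not the texts.js model; the class names (`Link`, `Paragraph`, `Document`) and the `to_markdown` function are hypothetical and only show how keeping the author's reference labels in the model lets a Markdown-to-Markdown round trip avoid renaming them.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class Link:
    text: str
    label: str  # the reference label exactly as the author wrote it


@dataclass
class Paragraph:
    # Each span is either a plain string or a Link (hypothetical layout).
    spans: List[object] = field(default_factory=list)


@dataclass
class Document:
    paragraphs: List[Paragraph] = field(default_factory=list)
    references: Dict[str, str] = field(default_factory=dict)  # label -> URL


def to_markdown(doc: Document) -> str:
    """Write the model back out, reusing the stored reference labels."""
    parts = []
    for para in doc.paragraphs:
        parts.append("".join(
            f"[{span.text}][{span.label}]" if isinstance(span, Link) else span
            for span in para.spans
        ))
    refs = [f"[{label}]: {url}" for label, url in doc.references.items()]
    if refs:
        parts.append("\n".join(refs))
    return "\n\n".join(parts)


def to_json(doc: Document) -> str:
    """Serialize the model as JSON so other languages can consume it."""
    return json.dumps(asdict(doc))
```

Because the label lives in the model rather than being regenerated by the formatter, a document loaded and saved through such a model would keep `[spec]` as `[spec]` instead of turning it into a numbered reference.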

Wow, what timing! I’ve experienced the same problems with MD, especially with regard to syntax confusion and Stack Overflow’s out-of-date implementation.

As it happens, the Markdown community is working on a central site to coordinate the various Markdown implementations:

https://github.com/markdown

Anyone interested in contributing is invited to connect to GitHub.

I prefer org-mode. Unfortunately, only Emacs understands it.

I find it hilarious that right after talking about making Markdown “work in the same way across all web sites in the world when they begin typing,” you include an image that almost, but not quite, specifies how a Markdown icon should look (where is the down arrow supposed to sit horizontally, and how long should the icon be?).

A formal specification for an extendible, human-readable Markdown-like language is a great idea, but something else is needed first. We need a specification for characterising and classifying text processing systems, including text formatting interfaces, mark-up languages, and the text being represented. A mark-up language may be characterised by its use of semantic mark-up (e.g. emphasis rather than italics), single new lines being treated as spaces (to allow formatting on non-wrapping displays), and other such qualities.

We recently added Markdown as an option in our hosted CMS (YikeSite) in hopes that some of our customers would choose it over the WYSIWYG editor.

You can play with it here: http://www.markdowncms.com

If there was a standardized Markdown, we would implement that for sure.

I wrote the Erlang Markdown interpreter https://github.com/hypernumbers/erlmarkdown

Real-life Markdown almost always requires that something be rendered twice:

* on the server
* in the client preview

The sane way to make this easy is for the JavaScript version to be the dominant one, and for all the server-side language versions to test themselves the same way: if I start from the same Markdown, do I produce the same whitespace-compatible output as the JavaScript parser?

When I wrote erlmarkdown I first implemented my take on the spec – then realised I needed to switch to tracking a reference javascript version – which didn’t exist/wasn’t maintained/etc, etc

Then I discovered that the whitespace output of the javascript parser was all over the shop, random linebreaks and stuff – and it turned out to be super-hard to backport that into my hand-written, look-ahead parser.

Most of the whitespace differences were just HTML farts of no consequence – but the inability to produce whitespace-equivalent output meant I couldn’t write simple asserts like:

* =?equals(Got, Expected)

It was generally a real pain.
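One way around the whitespace problem described above is to normalize both outputs before comparing them. The following is a minimal Python sketch under stated assumptions, not what erlmarkdown does: the function names are hypothetical, and the normalization is deliberately crude. It would also erase meaningful whitespace inside `<pre>` blocks and between inline tags, so a real test suite would need to treat those cases separately.

```python
import re


def normalize_html_whitespace(html: str) -> str:
    """Collapse whitespace runs and drop whitespace between adjacent tags."""
    collapsed = re.sub(r"\s+", " ", html.strip())
    # Remove the space left between tags, e.g. "</p> <p>" becomes "</p><p>".
    return re.sub(r">\s+<", "><", collapsed)


def same_output(got: str, expected: str) -> bool:
    """Compare two renderings while ignoring incidental whitespace."""
    return normalize_html_whitespace(got) == normalize_html_whitespace(expected)
```

With a helper like this, a server-side implementation could assert `same_output(got, expected)` against the reference parser's output without having to reproduce its exact line breaks.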

PS I also think that Markdown implementations should accept all HTML and XML tags from an arbitrary implementation-time whitelist.

PPS They need to accept Unicode input as well.

I agree on having a specification, and I don’t think you need Gruber’s blessing. In fact, it may be healthiest at this point for the community to diverge from its origins.

Extensions are a must-have in the specification. There are some sites that require source code, and other sites where source code markup makes no sense. If nothing else, extensions allow the core to remain fixed as new use cases pop up, and allow experimental changes that may ultimately migrate to the core. One of the things that has allowed HTML to age gracefully is that the behavior for unknown elements is well-defined, so new elements can be added in a manner that doesn’t break on older browsers.

+1

I’m not a complete power user of markdown – but the idea of continued fragmentation would be really frustrating.

Ugh. I absolutely HATE markdown.

I prefer UBB code instead. It is much easier to remember and use. Not to mention way cleaner and easier to implement.

One of the currently broken things that seems easily fixable is the code highlighting.

For example, GitHub uses signposts.
Stack Overflow uses indentation (without the ability to specify a language)?
Jekyll, which uses Markdown, doesn’t use Markdown for code highlighting but uses Liquid templating.

I wish Markdown would include signposted code highlighting with the desired language specified (like in GitHub).
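As a rough illustration of the GitHub-style approach mentioned above, the Python sketch below pulls fenced code blocks and their declared language out of a piece of text. It assumes a triple-backtick fence with an optional language name after the opening fence; the function name is hypothetical, and it ignores edge cases such as nested or unclosed fences (an unclosed block is simply dropped).

```python
from typing import List, Optional, Tuple

FENCE = "`" * 3  # three backticks, the delimiter GitHub uses


def extract_fenced_blocks(text: str) -> List[Tuple[Optional[str], str]]:
    """Return a (language, code) pair for each fenced block in the text."""
    blocks = []
    language = None
    body = None  # None means we are currently outside a fence
    for line in text.splitlines():
        stripped = line.strip()
        if body is None:
            if stripped.startswith(FENCE):
                # Whatever follows the opening fence names the language, if any.
                language = stripped[len(FENCE):].strip() or None
                body = []
        elif stripped == FENCE:
            blocks.append((language, "\n".join(body)))
            body = None
        else:
            body.append(line)
    return blocks
```

A highlighter could then pick a lexer from the language name and fall back to plain text when it is `None`, which is the piece the indentation-based style has no way to express.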

Maybe you could consider one of our new and shiny W3C Community Groups at http://www.w3.org/Community for starting this work?

Liam (W3C staff)

this is awesome!!!!!! i’m on board and will follow your lead in implementing Markdown/Rockdown on the site i’m building.

Your proposal of “return-based linebreaks” shows that you do not understand Markdown.

Markdown is not a markup language. It is “a plain text formatting syntax” (and a software tool). The idea is that you can format text so that it is easy to read as plain text and also easy to convert.

Plain text without line breaks is not easy to read.

Hey Jeff, GitHub’s interested. I sent you an email with an introduction to our Head Markdown Guru. Let’s make it happen!

My .02 – I wouldn’t do a “community group” or any such thing. I’d take the Crockford route for this — i.e., create a Web site for the tests and implementation links, guides, etc., and publish the actual spec as a simple Informational RFC.

That would work well especially if Gruber blesses you as benevolent Markdown dictator (or reclaims that role for himself).

YMMV.

I think the reasons for overlooking reStructuredText are overstated.

Markdown is not unavoidable. I’m a fairly experienced programmer (Django core committer, and develop the full stack from SQL through to Javascript), and though I barely know Markdown, I really don’t feel like I’m missing much.

For example, on GitHub: name your README as README.rst and it will be parsed as reStructuredText. The fact that GitHub uses Markdown for comments is no big deal really – I use GitHub and hadn’t noticed that it was Markdown, because the amount of markup you need on something like GitHub is actually pretty minimal.

I think that that is Markdown’s sweet spot – really simple stuff, where you don’t actually know that you are using it, and it’s not that important to know the rules.

Once you get beyond that, and actually need a formal spec, I think it is foolish to try to use Markdown, because it wasn’t designed for that. Rather, you want something more stable and predictable, and also extensible, like reStructuredText. reStructuredText is also very popular in the Python world – almost all new documentation uses it (using Sphinx), which is made possible by its extensibility.

reStructuredText also has multiple highly compatible implementations (Python’s docutils and Haskell’s Pandoc, perhaps others).

Written by cwyalpha