Tables in pure Markdown

2 Likes

This discussion will soon be four years old, it would be lovely if at some point something was standardized :slight_smile:

4 Likes

A pipe table syntax with lots of features:

|              | Header 1        | Header 2                       || Header 3                       ||
|              | Subheader 1     | Subheader 2.1  | Subheader 2.2  | Subheader 3.1  | Subheader 3.2  |
|==============|-----------------|----------------|----------------|----------------|----------------|
| Row Header 1 | 3row, 3col span                                 ||| Colspan only                   ||
| Row Header 2 |       ^                                         ||| Rowspan only   | Cell           |
| Row Header 3 |       ^                                         |||       ^        | Cell           |
| Row Header 4 |  Row            |  Each cell     |:   Centered   :| Right-aligned :|: Left-aligned  |
:              :  with multiple  :  has room for  :   multi-line   :    multi-line  :  multi-line    :
:              :  lines.         :  more text.    :      text.     :         text.  :  text.         :
|--------------|-----------------|----------------|----------------|----------------|----------------|
[Caption Text]

  1. Multiple rows of headers and subheaders (Thank you vas)

  2. Row headers, which are indicated by replacing dashes - with equals signs = in the first column’s delimiter row (Thank you again vas)

  3. Row spans using a carat/circumflex ^ (Thank you jgm)

  4. Column spans using multiple pipes ||| (MultiMarkdown, Maruku)

  5. Caption surrounded by brackets [ ] on the line just below the table (MultiMarkdown)

  6. Multi-line cell continuation using a colon : in place of a pipe | (like in PostreSQL’s interactive terminal, as discussed by David Wheeler in RFC: A Simple Markdown Table Format and suggested above by illionas)

  7. Per-cell alignment using colon(s) : inside a cell, to the left/right/both of the cell’s first line of text (inspired by the |:---:| syntax for per-column alignment)


An alternative syntax, for better compatibility with existing pipe tables:

| Caption Text |                 |                |                |                |                |
|--------------|-----------------|----------------|----------------|----------------|----------------|
|              | Header 1        | Header 2       |        <       | Header 3       |        <       |
|              | Subheader 1     | Subheader 2.1  | Subheader 2.2  | Subheader 3.1  | Subheader 3.2  |
|==============|-----------------|----------------|----------------|----------------|----------------|
| Row Header 1 | 3row, 3col span |       <        |        <       | Colspan only   |        <       |
| Row Header 2 |       ^         |       <        |        <       | Rowspan only   | Cell           |
| Row Header 3 |       ^         |       <        |        <       |       ^        | Cell           |
| Row Header 4 |  Row            |  Each cell     |:   Centered   :| Right-aligned :|: Left-aligned  |
|.            .|. with multiple .|. has room for .|.  multi-line  .|.   multi-line .|. multi-line   .|
|.            .|. lines.        .|. more text.   .|.     text.    .|.        text. .|. text.        .|

A future extension could implement this syntax.

For now, in GFM and other existing Markdown flavors, it falls back to an ordinary table whose cells contain symbols that visually represent the additional features:

  1. One row in the rendered table consists of cells containing only dashes - and cells containing only equals signs =. This resembles a “delimiter row”. Cells above this row represent headers and subheaders. Cells below that row represent other table cells.

  2. In the “delimiter row”, if the first cell contains only equals signs =, it indicates that the cells in the first column represent row headers.

  3. A row span is represented by cells containing only a carat/circumflex ^.

  4. A column span is represented by cells containing only a “less-than” symbol <. (Inspired in part by 0x666C697473’s above proposal to use a “greater-than” symbol > for column spans.)

  5. A caption is represented by text in the top-left cell (above all other cells, headers, etc.)

  6. Multi-line cell continuation is represented by row(s) whose cells each start and end with a single dot that is whitespace-separated from the cell’s text content. |. (text) .| Visually, the dots in each column resemble a vertical ellipsis which indicates that the above cell continues downward.

  7. Per-cell text alignment is indicated if a cell (or, in a multiline cell, the first line) starts and/or ends with a colon that is whitespace-separated from the cell’s text content. | (text) :|, |: (text) |, |: (text) :|

In this table, the Markdown text can be compressed to remove extra dashes, equals signs, and whitespace, as long as there remains whitespace next to colons |: :| for alignment and next to dots |. .| for cell continuation.

EDIT: I have revised the above syntax proposal to simplify it. The earlier version was as follows:

|==============|                 |                |                |                |                |
|--------------|-----------------|----------------|----------------|----------------|----------------|
|              | Header 1        | Header 2       |        <       | Header 3       |        <       |
|              | Subheader 1     | Subheader 2.1  | Subheader 2.2  | Subheader 3.1  | Subheader 3.2  |
|==============|=================|================|================|================|================|
| Row Header 1 | 3row, 3col span |       <        |        <       | Colspan only   |        <       |
| Row Header 2 |       ^         |       <        |        <       | Rowspan only   | Cell           |
| Row Header 3 |       ^         |       <        |        <       |       ^        | Cell           |
| Row Header 4 |  Row            |  Each cell     |:   Centered   :| Right-aligned :|: Left-aligned  |
|.            .|. with multiple .|. has room for .|.  multi-line  .|.   multi-line .|. multi-line   .|
|.            .|. lines.        .|. more text.   .|.     text.    .|.        text. .|. text.        .|
|--------------|-----------------|----------------|----------------|----------------|----------------|
| Caption Text |
1 Like

There is a related topic discussing the introduction of figure environments and captions for images: Image tag should expand to figure when used with title It would be nice if a unified caption syntax would emerge from these two topics.

Captions could be added as a table row at the bottom (or perhaps top) of a pipe table. But since an image isn’t made up of rows it wouldn’t make sense for image captions to also use this syntax. However, both transcluded images and CSV table content blocks do share the same caption syntax.

Modifying meg’s proposal, here’s a table syntax based on key-value pairs:

title: The Title | name: The Name | ph: The Phone
-|-|-
title: value 1
name:  value 2
ph:    value 3
||
title: value 4
name:  value 5
ph:    value 6

An alternative, using headers as keys:

The Title | The Name | The Phone
-|-|-
The Title: value 1
The Name:  value 2
The Phone: value 3
||
The Title: value 4
The Name:  value 5
The Phone: value 6

A future extension could convert the above code into this table:

GFM-table-2

In current GitHub-Flavored Markdown, it falls back to a table with keys/values in the first column:

@aoudad: This syntax violates the Prime Directive. It’s worth reading the discussion in that thread.

Even though it’s spelled out both by Gruber when he introduced Markdown and by @jgm in the introduction to CommonMark, a lot of ideas and proposals on this forum lose sight of it, and it confuses efforts to solidify and advance this standard. Maybe create a topic titled “The Philosophy and Spirit of Markdown” or “The Markdown Prime Directive”, and pin it to the top of the forum? @jgm, @codinghorror, what do you think? I’d be happy to make the initial post (I’ve been drafting something about this), though it might be best if it came from John. I realize, John, that you’ve already done this in What Is Markdown? Maybe just post and pin that at the top of the forum?

I think it important that the philosophy and spirit stay in the forefront of everyone’s minds as we try to get to v1.0 as well to v1.1 or v2. Any new directions that ditch the original philosophy are fine, but they shouldn’t be called Markdown.

3 Likes

For better readability, the table could be written as follows:

| The Title | The Name | The Phone |
|-----------|----------|-----------|
| The Title: value 1               |
| The Name:  value 2               |
| The Phone: value 3               |
| ||                               |
| The Title: value 4               |
| The Name:  value 5               |
| The Phone: value 6               |

These additional pipes and dashes make it look more table-like. And as before, it falls back to a valid table in existing GitHub-Flavored Markdown.


The question is, does readability take absolute priority over everything else? If so, “Prime Directive” would seem to be an appropriate metaphor.

I would argue, though, that sometimes readability should be weighed against other considerations. For example, compatibility with existing Markdown flavors is essential to CommonMark’s mission of specifying Markdown. Also, CommonMark needs to respect the Principle of Uniformity (i.e. text content has the same meaning whether or not it is inside a container block), since the spec for lists and block quotes presupposes this principle.

In any case, I like your idea for a topic about the philosophy and spirit of Markdown. This is becoming a longer discussion, and I think it’s worthy of its own place on the forum.

2 Likes

I really like this.
My idea is improvement for more machine- and human-readable.

Simple One Rule: Always start from a pipe character (|) for machine-readability.
This rule would avoid conflicts with other syntax.

|######################################## Caption Text ##########################################
|_______________________________________________________________________________________________,
|              | Header 1      || Header 2                     || Header 3                      |
|              | Subheader 1   | Subheader 2.1 | Subheader 2.2 |  Subheader 3.1 | Subheader 3.2 |
|==============|---------------|---------------|---------------|----------------|---------------|
| Row Header 1 ||| 3row, 3col span                             || Colspan only                  |
|______________|                                               |________________|_______________|
| Row Header 2 |^                                              |  Rowspan only  | Cell          |
|______________|                                               |                |_______________|
| Row Header 3 |^                                              |^               | Cell          |
|______________|_______________________________________________|________________|_______________|
| Row Header 4 | Row           | Each cell     |:   Centered  :| Right-aligned :|: Left-aligned |
|~             | with multiple | has room for  |   multi-line  |    multi-line  |  multi-line   |
|~             | lines.        | more text.    |      text.    |         text.  |  text.        |
|______________|_______________|_______________|_______________|________________|_______________/

For human-readability, rule lines are sometimes useful. For machines, however, these have no mean.
So lines stating from |_ can be introduced, which can be ignored like comment lines.

Let’s remove lines starting from |_.

|######################################## Caption Text ##########################################
|              | Header 1      || Header 2                     || Header 3                      |
|              | Subheader 1   | Subheader 2.1 | Subheader 2.2 |  Subheader 3.1 | Subheader 3.2 |
|==============|---------------|---------------|---------------|----------------|---------------|
| Row Header 1 ||| 3row, 3col span                             || Colspan only                  |
| Row Header 2 |^                                              |  Rowspan only  | Cell          |
| Row Header 3 |^                                              |^               | Cell          |
| Row Header 4 | Row           | Each cell     |:   Centered  :| Right-aligned :|: Left-aligned |
|~             | with multiple | has room for  |   multi-line  |    multi-line  |  multi-line   |
|~             | lines.        | more text.    |      text.    |         text.  |  text.        |

I think that it is better to use double or more pipe character BEFORE a table cell.
It makes parser a little easier for the colspan attribute creation.
Also, it allows us omit the last pipe character.

|######################################## Caption Text ##########################################
|              | Header 1      || Header 2                     || Header 3
|              | Subheader 1   | Subheader 2.1 | Subheader 2.2 |  Subheader 3.1 | Subheader 3.2
|==============|---------------|---------------|---------------|----------------|---------------
| Row Header 1 ||| 3row, 3col span                             || Colspan only
| Row Header 2 |^                                              |  Rowspan only  | Cell
| Row Header 3 |^                                              |^               | Cell
| Row Header 4 | Row           | Each cell     |:   Centered  :| Right-aligned :|: Left-aligned
|~             | with multiple | has room for  |   multi-line  |    multi-line  |  multi-line
|~             | lines.        | more text.    |      text.    |         text.  |  text.

Lines starting from |# constitute a caption text.
For this example, I write a caption text before a table because a <caption> element should be the first child of <table> element, but it’s not important.
Like lines for h1, h2, …, enclosing text with # should be allowed but its count does not matter.
The simplest form is |# Caption Text.

Keyword |^ increases rowspan, but no space should be allowed between | and ^ to simplify parser.
This no space rule would be also useful for other keywords.

To describe a row with multiple lines, keyword |~ can be used at the first of subsequent lines, instead of : use as a column separator.
Optionally |~ can be used not only at the first but also at each separator, but the first |~ is required.

Finally, I think that table syntax can be a extension of CommonMark, but I will be happy if it is released as a formal specification!

1 Like