PDAL on the VDI: pipelines

What is a PDAL pipeline?

In the same way that you can pipe output from one command to another at a bash prompt, you can chain data from one PDAL stage to another using the pipeline application.

For a simple read-then-write chain there is a command-line shortcut, the translate application:

$ pdal translate input.las output.laz

... but pipelines are more commonly defined in a JSON configuration file which is fed to PDAL, for example:

{
    "pipeline": [
        {
            "type": "readers.las",
            "filename": "input.las"
        },
        {
            "type": "writers.las",
            "filename": "output.laz"
        }
    ]
}

...and then passing that file to the pipeline application:

$ pdal pipeline some_tasks.json

We will deploy and explain this example shortly.

Pipeline syntax can be a little tricky to get a handle on. The basic flow, however, is:

{
    "pipeline": [
        {
            <stage 1, parameter 1>,
            <stage 1, parameter 2>
        },
        {
            <stage 2, parameter 1>,
            <stage 2, parameter 2>
        }
    ]
}

You'll see this as we progress through the rest of the PDAL tasks - the pipeline is a fundamental application for most PDAL operations. We'll step through a simple example here, and learn more about pipelines as we attack other tasks in this workshop.

You will also see that PDAL understands a lot of shorthand - which we demonstrate below.

Data translation with a pipeline

Open a new text file, name it 'las2laz.json' and paste in the following JSON:

{
    "pipeline": [
        {
            "type": "readers.las",
            "filename": "/g/data/rr1/Elevation/Merimbula0313/z55/2013/Mass_Points/LAS/AHD/LAS/Tiles_2k_2k/Merimbula2013-C3-AHD_7605910_55_0002_0002.las"
        },
        {
            "type": "writers.las",
            "filename": "./Merimbula2013-C3-AHD_7605910_55_0002_0002.laz"
        }
    ]
}

In the same directory, type:

$ pdal pipeline las2laz.json

...after a wait, the command prompt will return and you should be able to see the new .laz file you've just created. It should be substantially smaller - compare its size on disk with the source .las file:

$ du -h ./Merimbula2013-C3-AHD_7605910_55_0002_0002.laz
$ du -h /g/data/rr1/Elevation/Merimbula0313/z55/2013/Mass_Points/LAS/AHD/LAS/Tiles_2k_2k/Merimbula2013-C3-AHD_7605910_55_0002_0002.las

...and use your knowledge of PDAL's metadata capabilities to check the integrity of your new .laz file.
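
A quick check might look something like this (using pdal info's metadata report, as covered earlier in the workshop):

$ pdal info --metadata ./Merimbula2013-C3-AHD_7605910_55_0002_0002.laz

The reported point count and extents should match the source .las tile, and the metadata should also indicate that the file is compressed.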

This task can be done as a one-liner with pdal translate (something like the call sketched below), so why learn pipelines? The next example will start to show why.
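
A sketch of the equivalent translate call, using the same input tile and output name as above:

$ pdal translate /g/data/rr1/Elevation/Merimbula0313/z55/2013/Mass_Points/LAS/AHD/LAS/Tiles_2k_2k/Merimbula2013-C3-AHD_7605910_55_0002_0002.las ./Merimbula2013-C3-AHD_7605910_55_0002_0002.laz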

Simple data filter with a pipeline

Great! We can compress a file. What else can we do? Let's extract only the points classified as buildings (ASPRS classification code 6), and compress the results. Here, we add a filter stage between the reading and writing stages:

{
    "pipeline": [
        {
            "type": "readers.las",
            "filename": "/g/data/rr1/Elevation/Merimbula0313/z55/2013/Mass_Points/LAS/AHD/LAS/Tiles_2k_2k/Merimbula2013-C3-AHD_7605910_55_0002_0002.las"
        },
        {
            "type": "filters.range",
            "limits": "Classification[6:6]"
        },
        {
            "type": "writers.las",
            "filename": "./merimbula_buildings.laz"
        }
    ]
}

Now that's quite verbose. So we can use PDAL's understanding of file formats to write it in shorthand:


{
    "pipeline": [
        "/g/data/rr1/Elevation/Merimbula0313/z55/2013/Mass_Points/LAS/AHD/LAS/Tiles_2k_2k/Merimbula2013-C3-AHD_7605910_55_0002_0002.las",
        {
            "limits": "Classification[6:6]",
            "type": "filters.range"
        },
        "./merimbula_buildings.laz"
    ]
}

...which does exactly the same task:

  1. Recognise that our input file is a .las file and read it
  2. Select only building-classified points
  3. Write the selected points out to a compressed .laz file
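
Step 2 is where filters.range does its work. The limits option accepts a comma-separated list of ranges, so keeping more than one class - say ground (class 2) as well as buildings - is a small change to that stage. A minimal sketch:

{
    "type": "filters.range",
    "limits": "Classification[2:2], Classification[6:6]"
}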

For a challenge - how might we check whether we actually only have buildings left in our new dataset?
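
Hint: one possible check (a sketch using pdal info's stats report) is to confirm that the minimum and maximum of the Classification dimension are both 6:

$ pdal info --stats ./merimbula_buildings.laz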

More reading: http://www.pdal.io/pipeline.html#pipeline
