Java Code Audit: A Complete Walkthrough of the DocumentBuilder XXE Call Chain

0x01 Debugging Walkthrough

Payload

package com.DemoXXE.Demo1DocumentBuilder;

import org.w3c.dom.Document;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.io.InputStream;

public class DocumentXXE {
    public static void main(String[] args) throws Exception {
        String str = "<!DOCTYPE doc [ \n" +
                "<!ENTITY xxe SYSTEM \"http://127.0.0.1:8000\">\n" +
                "]><doc>&xxe;</doc>";
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

//        // Disable DTDs (doctypes); this blocks almost every XML entity attack
//        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true); // preferred
//
//        // If DTDs cannot be disabled, use the two settings below; both must be set together
//        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);        // blocks external general entity PoCs
//        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);   // blocks parameter entity PoCs

        DocumentBuilder db = dbf.newDocumentBuilder();
        InputStream is = new ByteArrayInputStream(str.getBytes());
        Document doc = db.parse(is);
    }
}
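The two mitigations that are commented out in the payload can be tried without any network access. The following is a minimal sketch, assuming the JDK's built-in parser; the class and method names (HardenedBuilderSketch, rejectedByNoDoctype, textWithNoExternalEntities) are made up for illustration and are not part of the original demo.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

final class HardenedBuilderSketch {
    // Same payload as in the article.
    static final String PAYLOAD =
            "<!DOCTYPE doc [ \n"
            + "<!ENTITY xxe SYSTEM \"http://127.0.0.1:8000\">\n"
            + "]><doc>&xxe;</doc>";

    /** Option 1: refuse any document that carries a DOCTYPE declaration. */
    static DocumentBuilder noDoctypeBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return dbf.newDocumentBuilder();
    }

    /** Option 2: keep DTDs, but never load external general or parameter entities. */
    static DocumentBuilder noExternalEntitiesBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        return dbf.newDocumentBuilder();
    }

    /** Returns true if option 1 refuses to parse the given XML. */
    static boolean rejectedByNoDoctype(String xml) {
        try {
            noDoctypeBuilder().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true; // the parser reports the DOCTYPE as a fatal error
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses with option 2 and returns the text content of the root element. */
    static String textWithNoExternalEntities(String xml) {
        try {
            Document doc = noExternalEntitiesBuilder().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("option 1 rejects payload: " + rejectedByNoDoctype(PAYLOAD));
        System.out.println("option 2 text: '" + textWithNoExternalEntities(PAYLOAD) + "'");
    }
}
```

With option 1 the parse fails as soon as the DOCTYPE is seen; with option 2 the document still parses, but the &xxe; reference is skipped rather than fetched.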

The frames below only lead into the parsing method and are not important in themselves:

parse:799, XML11Configuration (com.sun.org.apache.xerces.internal.parsers)
parse:771, XML11Configuration (com.sun.org.apache.xerces.internal.parsers)
parse:141, XMLParser (com.sun.org.apache.xerces.internal.parsers)
parse:243, DOMParser (com.sun.org.apache.xerces.internal.parsers)
parse:339, DocumentBuilderImpl (com.sun.org.apache.xerces.internal.jaxp)
parse:121, DocumentBuilder (javax.xml.parsers)
main:18, DocumentXXE (com.DemoXXE.Demo1DocumentBuilder)


Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/javax/xml/parsers/DocumentBuilder.java

DocumentBuilder#parse
Step into parse(in).

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/jaxp/DocumentBuilderImpl.java

DocumentBuilderImpl#parse
Step into domParser.parse(is);

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/parsers/DOMParser.java

DOMParser#parse

Step into parse(xmlInputSource).

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/parsers/XMLParser.java

XMLParser#parse

Step into fConfiguration.parse(inputSource);

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/parsers/XML11Configuration.java

XML11Configuration#parse

Step into parse(true).

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/parsers/XML11Configuration.java

XML11Configuration#parse

Step into fCurrentScanner.scanDocument(complete);

From here on, the parser actually starts parsing the XML.

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/impl/XMLDocumentFragmentScannerImpl.java

XMLDocumentFragmentScannerImpl#scanDocument

Step into int event = next().

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/impl/XMLDocumentScannerImpl.java

XMLDocumentScannerImpl#next

Step into fDriver.next();

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/impl/XMLDocumentScannerImpl.java

XMLDocumentScannerImpl#next

Note: the scanner state is set to SCANNER_STATE_PROLOG here, which matters later.

Execution reaches the call fEntityScanner.skipString(xmlDecl).

From its definition, xmlDecl is the character array <?xml:

static final char [] xmlDecl = {'<','?','x','m','l'};

Step into fEntityScanner.skipString(xmlDecl).

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/impl/XMLEntityScanner.java

XMLEntityScanner#skipString

This checks whether the first five characters of the XML string we passed in equal <?xml.

The first five characters of our XML string are <!DOC, so they do not match and the method takes the return false branch.
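The matching behaviour described here can be modelled in a few lines. This is a simplified sketch, not the JDK implementation: the real XMLEntityScanner works on a refillable buffer and its skipChar takes a second argument, whereas the hypothetical CursorSketch below just walks a fixed char array.

```java
final class CursorSketch {
    private final char[] buf;
    private int pos; // index of the next unread character

    CursorSketch(String input) {
        this.buf = input.toCharArray();
    }

    /** Consumes expected and returns true only if it matches at the current position. */
    boolean skipString(char[] expected) {
        if (pos + expected.length > buf.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (buf[pos + i] != expected[i]) {
                return false; // mismatch: nothing is consumed
            }
        }
        pos += expected.length;
        return true;
    }

    /** Consumes one character and returns true only if it equals c. */
    boolean skipChar(char c) {
        if (pos < buf.length && buf[pos] == c) {
            pos++;
            return true;
        }
        return false;
    }

    int position() {
        return pos;
    }

    public static void main(String[] args) {
        char[] xmlDecl = {'<', '?', 'x', 'm', 'l'};
        CursorSketch c = new CursorSketch("<!DOCTYPE doc [ ]><doc/>");
        System.out.println(c.skipString(xmlDecl));                 // false: input starts with <!DOC
        System.out.println(c.skipChar('<'));                       // true
        System.out.println(c.skipChar('!'));                       // true
        System.out.println(c.skipString("DOCTYPE".toCharArray())); // true
    }
}
```

The point of the model is that a failed match leaves the position untouched, which is why the later skipChar and skipString(DOCTYPE) checks still see the input from its first character.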

Control returns to:

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/impl/XMLDocumentScannerImpl.java

XMLDocumentScannerImpl#next

The method returns XMLEvent.START_DOCUMENT, which is defined as START_DOCUMENT = 7.
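The numeric event codes seen in the debugger can be mapped back to names. The sketch below assumes the scanner's XMLEvent constants are the ones inherited from javax.xml.stream.XMLStreamConstants; EventCodeSketch and its name method are illustrative helpers, not JDK classes.

```java
import javax.xml.stream.XMLStreamConstants;

final class EventCodeSketch {
    /** Maps a numeric event code, as shown in the debugger, to its constant name. */
    static String name(int event) {
        switch (event) {
            case XMLStreamConstants.START_ELEMENT:    return "START_ELEMENT";
            case XMLStreamConstants.END_ELEMENT:      return "END_ELEMENT";
            case XMLStreamConstants.CHARACTERS:       return "CHARACTERS";
            case XMLStreamConstants.START_DOCUMENT:   return "START_DOCUMENT";
            case XMLStreamConstants.END_DOCUMENT:     return "END_DOCUMENT";
            case XMLStreamConstants.ENTITY_REFERENCE: return "ENTITY_REFERENCE";
            case XMLStreamConstants.DTD:              return "DTD";
            default:                                  return "OTHER(" + event + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(name(7)); // prints START_DOCUMENT
        System.out.println(name(1)); // prints START_ELEMENT
    }
}
```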

Control returns to:

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/impl/XMLDocumentFragmentScannerImpl.java

XMLDocumentFragmentScannerImpl#scanDocument

So event = 7, i.e. event = START_DOCUMENT, and the switch enters the START_DOCUMENT case.

After that, execution enters next() again.

This leads into:

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/impl/XMLDocumentScannerImpl.java

XMLDocumentScannerImpl#next

Because the state was set to SCANNER_STATE_PROLOG earlier, execution enters the SCANNER_STATE_PROLOG block.

Step into fEntityScanner.skipChar('<', null).

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/impl/XMLEntityScanner.java

XMLEntityScanner#skipChar

This simply checks whether the first character of the string we passed in is <, and returns true if it is.

The state is therefore set to SCANNER_STATE_START_OF_MARKUP.

Execution then enters the SCANNER_STATE_START_OF_MARKUP block.

This checks whether the following characters are !-.

The following characters are !D, so the check fails.

Execution therefore reaches else if (fEntityScanner.skipString(DOCTYPE)),

where DOCTYPE is defined as private static final char [] DOCTYPE = {'D','O','C','T','Y','P','E'};

In our string, the characters after <! are DOCTYPE, so the condition holds.

The state is then set to SCANNER_STATE_DOCTYPE.

Execution then enters the SCANNER_STATE_DOCTYPE block.
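The sequence of state changes traced so far can be summarised as a toy state machine. This is only a sketch of the decision order described in the walkthrough: the State names are shortened stand-ins for the SCANNER_STATE_* constants, and PrologStateSketch is a hypothetical class that ignores everything the real prolog driver does beyond choosing the next state.

```java
import java.util.ArrayList;
import java.util.List;

final class PrologStateSketch {
    // Shortened stand-ins for the scanner's SCANNER_STATE_* constants.
    enum State { PROLOG, START_OF_MARKUP, PI, COMMENT, DOCTYPE, ROOT_ELEMENT }

    /** Returns the states visited while classifying the first piece of markup. */
    static List<State> trace(String xml) {
        List<State> visited = new ArrayList<>();
        visited.add(State.PROLOG);
        int pos = 0;
        if (!xml.startsWith("<", pos)) {
            return visited; // no markup at the current position
        }
        pos++;
        visited.add(State.START_OF_MARKUP);
        if (xml.startsWith("?", pos)) {
            visited.add(State.PI);
        } else if (xml.startsWith("!--", pos)) {
            visited.add(State.COMMENT);
        } else if (xml.startsWith("!DOCTYPE", pos)) {
            visited.add(State.DOCTYPE);
        } else {
            visited.add(State.ROOT_ELEMENT);
        }
        return visited;
    }

    public static void main(String[] args) {
        // prints [PROLOG, START_OF_MARKUP, DOCTYPE]
        System.out.println(trace("<!DOCTYPE doc [ ]><doc>&xxe;</doc>"));
    }
}
```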

From here on, the scanner loops through the rest of the XML we passed in.

Finally, when event = 1 (START_ELEMENT), execution enters next.

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/impl/XMLDocumentFragmentScannerImpl.java

XMLDocumentFragmentScannerImpl#next

After the & in the payload is matched, the state is set to SCANNER_STATE_REFERENCE.

Step into scanEntityReference(fContentBuffer);

This reaches:

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/impl/XMLDocumentFragmentScannerImpl.java

XMLDocumentFragmentScannerImpl#scanEntityReference

The entity name xxe is obtained here.

Next, step into fEntityManager.startEntity(true, name, false);

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/impl/XMLEntityManager.java

XMLEntityManager#startEntity

The entity's value is retrieved here.

Execution then enters:

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/impl/XMLEntityManager.java

XMLEntityManager#startEntity(boolean isGE, String name, XMLInputSource xmlInputSource, boolean literal, boolean isExternal)

It then steps into the setupCurrentEntity method.

Note: every XXE vulnerability eventually ends up in setupCurrentEntity.

Java/JavaVirtualMachines/corretto-1.8.0_292/Contents/Home/src.zip!/com/sun/org/apache/xerces/internal/impl/XMLEntityManager.java

XMLEntityManager#setupCurrentEntity

This is where the external URL from the payload is requested.

The complete call chain is:

setupCurrentEntity:620, XMLEntityManager (com.sun.org.apache.xerces.internal.impl)
startEntity:1304, XMLEntityManager (com.sun.org.apache.xerces.internal.impl)
startEntity:1240, XMLEntityManager (com.sun.org.apache.xerces.internal.impl)
scanEntityReference:1908, XMLDocumentFragmentScannerImpl (com.sun.org.apache.xerces.internal.impl)
next:3061, XMLDocumentFragmentScannerImpl$FragmentContentDriver (com.sun.org.apache.xerces.internal.impl)
next:602, XMLDocumentScannerImpl (com.sun.org.apache.xerces.internal.impl)
scanDocument:505, XMLDocumentFragmentScannerImpl (com.sun.org.apache.xerces.internal.impl)
parse:842, XML11Configuration (com.sun.org.apache.xerces.internal.parsers)
parse:771, XML11Configuration (com.sun.org.apache.xerces.internal.parsers)
parse:141, XMLParser (com.sun.org.apache.xerces.internal.parsers)
parse:243, DOMParser (com.sun.org.apache.xerces.internal.parsers)
parse:339, DocumentBuilderImpl (com.sun.org.apache.xerces.internal.jaxp)
parse:121, DocumentBuilder (javax.xml.parsers)
main:18, DocumentXXE (com.DemoXXE.Demo1DocumentBuilder)
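The upper part of this chain can be observed without a debugger and without opening a network connection. The sketch below is one way to do it, assuming the JDK's built-in parser: it installs an EntityResolver that records the requested system identifier and the current stack, then answers with local text so the parser reads from the supplied reader instead of the URL. EntityResolutionTraceSketch, Observation and parseWithLocalResolver are illustrative names, not part of the original demo.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

final class EntityResolutionTraceSketch {
    // Same payload as in the article.
    static final String PAYLOAD =
            "<!DOCTYPE doc [ \n"
            + "<!ENTITY xxe SYSTEM \"http://127.0.0.1:8000\">\n"
            + "]><doc>&xxe;</doc>";

    /** What the resolver saw when the parser asked for the external entity. */
    static final class Observation {
        final List<String> systemIds = new ArrayList<>();
        final List<String> frames = new ArrayList<>();
        String text;
    }

    static Observation parseWithLocalResolver(String xml) {
        Observation seen = new Observation();
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            db.setEntityResolver((publicId, systemId) -> {
                seen.systemIds.add(systemId);
                for (StackTraceElement f : Thread.currentThread().getStackTrace()) {
                    seen.frames.add(f.getClassName() + "#" + f.getMethodName());
                }
                // Answer locally so that no connection to the URL is opened.
                return new InputSource(new StringReader("resolved-locally"));
            });
            seen.text = db.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        Observation seen = parseWithLocalResolver(PAYLOAD);
        System.out.println("requested: " + seen.systemIds);
        System.out.println("text: " + seen.text);
        for (String frame : seen.frames) {
            if (frame.contains("xerces")) {
                System.out.println(frame);
            }
        }
    }
}
```

On a stock JDK the printed frames should include scanEntityReference and XMLEntityManager#startEntity, matching the chain above; setupCurrentEntity itself does not appear because the resolver is consulted before it runs.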

0x02 Summary

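From the analysis above: by default, processing XML with Unmarshaller does not lead to XXE (this holds only on JDK 1.8; on JDK 1.6 and 1.7 the deserialization vulnerability is also present). The call stack shows that libraries and classes affected by XXE ultimately call the JDK's own XML-processing classes, and the core trigger path always arrives at XMLEntityManager#setupCurrentEntity.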
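To illustrate the point that different entry points share the same underlying classes, the sketch below feeds the same payload to a SAXParser instead of a DocumentBuilder. SAXParser is chosen here only as a second example; it is not analysed in this article. The handler answers the entity request locally and records any stack frames that belong to a class named XMLEntityManager. SaxSamePathSketch and entityManagerFrames are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxSamePathSketch {
    // Same payload as in the article.
    static final String PAYLOAD =
            "<!DOCTYPE doc [ \n"
            + "<!ENTITY xxe SYSTEM \"http://127.0.0.1:8000\">\n"
            + "]><doc>&xxe;</doc>";

    /** Returns the XMLEntityManager frames that were on the stack during entity resolution. */
    static List<String> entityManagerFrames(String xml) {
        List<String> frames = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                for (StackTraceElement f : Thread.currentThread().getStackTrace()) {
                    if (f.getClassName().endsWith("XMLEntityManager")) {
                        frames.add(f.getClassName() + "#" + f.getMethodName());
                    }
                }
                // Local replacement text, so no connection to the URL is opened.
                return new InputSource(new StringReader("stub"));
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return frames;
    }

    public static void main(String[] args) {
        for (String frame : entityManagerFrames(PAYLOAD)) {
            System.out.println(frame);
        }
    }
}
```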

0x03 References

http://www.lmxspace.com/2019/10/31/Java-XXE-%E6%80%BB%E7%BB%93

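Reposting terms: "Java Code Audit: A Complete Walkthrough of the DocumentBuilder XXE Call Chain" by ske is licensed under the Creative Commons Attribution 4.0 International License.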