Thankfully, the balls we need for MIO48/49 are routed out - B12 goes to R2447 and C12 to R2445. I forget which is Rx and which is Tx 🙂
I've gotten a lot out of PCBite probes in the past (https://www.amazon.com/dp/B08W3RM861) - they're basically little weighted pogo pins you can use to hit pesky tiny test points. Otherwise just lots of flux! I also like to use some hot glue to tack down and strain relief fragile joints like that.
On to your questions...
1) JTAG comes up disabled, and is enabled by the bootrom after it copies a stub to high memory. I think some ES chips had it enabled by default, but no production silicon does. I think I cover that a little bit here: https://blog.ropcha.in/part-1-zynq-glitching.html - if you only saw my gists, that site should have a great deal more info. The UART scripts can only dump the bootrom, but there are some exploits I detail that can take control while the bootrom is executing and *then* you can debug/breakpoint/whatever to your heart's content. The SD DMA one is probably easiest to reproduce, especially if you're not into soldering an entire TSOP's worth of bodge wires.
I used an Antminer board for my experimentation... I didn't find it locked down at all. Then again, I also have two working bootrom exploits, so it is impossible for them to dodge those!
I also took up that challenge to "no user access"! I wanted to see if there were security flaws as well, because it's kind of fun knowing that every single one of those chips is forever broken by a couple of tricks. I have my analysis databases floating around somewhere in both IDA Pro (very $$) and Binary Ninja ($, but it seems like the freebie supports ARMv7) formats if you'd like a peek.
Some time ago I picked up a used Metcal Curie effect soldering station; it feels like cheating. But my cheapo Amazon-sourced hot air gun has stuck around for years and been very handy. Hard to beat that price, and I imagine it'll come with fine enough nozzles to target resistors!
... domain referencing an exploitation technique: https://en.wikipedia.org/wiki/Return-oriented_programming I had forgotten that "ropcha" alone was a real town, having found that out some years after buying the domain...
I also have a little fiberglass pen (useful for removing soldermask) that I imagine would promote adhesion well, but have not tried it for that.
For some broad strokes: the binary is based at 0, reset jumps to 0x44 and main is 0x1688, code is basically constant til 0xb648, then there is a chunk from 0xfc80 to 0xff10 (this is the stub copied to high memory before locking away the rom). One undocumented tidbit is that RAM gets mapped to 0x70000 during bootrom execution.
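For navigating a dump, the layout above can be written down as a small lookup table. This is only a sketch: the region labels are made up for illustration, the end addresses are assumed to be exclusive, and anything the paragraph above does not mention is reported as unknown.

```python
# Regions of the bootrom image as described above (image based at 0).
# Labels are illustrative; end addresses are ASSUMED to be exclusive.
REGIONS = [
    (0x0000, 0xB648, "code"),
    (0xFC80, 0xFF10, "high-memory stub"),
]

RESET_TARGET = 0x44   # where reset jumps to
MAIN = 0x1688         # main


def classify(addr):
    """Return the label of the region containing addr, or 'unknown'."""
    for start, end, label in REGIONS:
        if start <= addr < end:
            return label
    return "unknown"


def extract(image, label):
    """Slice one labelled region out of a bootrom dump (a bytes object)."""
    for start, end, name in REGIONS:
        if name == label:
            return image[start:end]
    raise KeyError(label)
```

With a dump loaded as bytes, `extract(image, "high-memory stub")` gives just the stub for separate disassembly, and `classify()` is handy for sanity-checking branch targets.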
I'd really encourage looking at Binary Ninja (https://binary.ninja/free/) - it's a great tool. I've attached my analysis database (it's actually mostly an imported IDA database, but that's irrelevant...
Ghidra (https://ghidra-sre.org/) is also extremely powerful and entirely free. I never picked it up as I've always had binja/IDA access. It seems quite popular for reversing weird embedded machines; the NSA clearly needed a tool that was very good at that!
I have a 64-bit UltraScale that's been worked on intermittently over the last ~2 years, but I have not dumped its secrets yet. I think it's vulnerable to the ONFI parsing issue... maybe it's time to pick it up again.
I know why you're not finding the string, but I can keep my mouth shut if you'd like. Most of the mrc/mcr stuff is just clearing caches and predictors iirc. It's followed by the setup of (largely identity-mapped) page tables (probably apparent in the rom as a huuuuge table of repeated similar-but-different values passed by pointer to some mcr). There's this IDA plugin that I found useful: https://github.com/NeatMonster/AMIE and maybe you can borrow some of their work.
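If you'd rather annotate the mrc/mcr instructions by hand, a small decoder goes a long way. The sketch below assumes the code is plain ARM (A32, not Thumb) and decodes the standard MCR/MRC field layout; the name table only holds a handful of well-known ARMv7-A CP15 operations and is nowhere near complete. It is not taken from the AMIE plugin.

```python
# A few well-known ARMv7-A CP15 operations, keyed by (opc1, CRn, CRm, opc2).
# Deliberately incomplete; extend from the ARM ARM as needed.
CP15_NAMES = {
    (0, 1, 0, 0): "SCTLR",
    (0, 2, 0, 0): "TTBR0",
    (0, 3, 0, 0): "DACR",
    (0, 7, 5, 0): "ICIALLU",
    (0, 7, 5, 6): "BPIALL",
    (0, 8, 7, 0): "TLBIALL",
}


def decode_coproc(word):
    """Decode an A32 MCR/MRC instruction word; return text or None."""
    if (word >> 28) == 0xF:
        return None  # unconditional space (MCR2/MRC2), not handled here
    if ((word >> 24) & 0xF) != 0b1110 or not ((word >> 4) & 1):
        return None  # not a coprocessor register transfer
    opc1 = (word >> 21) & 0x7
    is_read = (word >> 20) & 1      # 1 = mrc (read), 0 = mcr (write)
    crn = (word >> 16) & 0xF
    rt = (word >> 12) & 0xF
    coproc = (word >> 8) & 0xF
    opc2 = (word >> 5) & 0x7
    crm = word & 0xF
    text = "%s p%d, %d, r%d, c%d, c%d, %d" % (
        "mrc" if is_read else "mcr", coproc, opc1, rt, crn, crm, opc2)
    name = CP15_NAMES.get((opc1, crn, crm, opc2)) if coproc == 15 else None
    return text + (" ; " + name if name else "")
```

For example, `decode_coproc(0xEE070F15)` decodes an instruction cache invalidate, which is the sort of thing that shows up in the cache/predictor clearing sequence.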
You're correct about the 2 more boot modes. One is a (broken, I believe - I asked Xilinx about this on the phone at some point) boot-from-fabric mode that attempts to boot from one of the AXI busses. The other is...I think another JTAG mode? I forget.
The former doesn't work because the ARM cores are responsible for loading the FPGA configuration (it's actually not done in the bootrom at all and the FSBL is solely responsible for this). You'd have to have a stub loader in some other boot media that comes up, blasts the bitstream into the FPGA, and then change boot modes externally and reboot into AXI. You can't set sticky bits to warm reset into a mode different from the current one either (I think the ZU+ supports this). It's clunky at absolute best.
You'll want to keep an eye on leaf function calls too btw, the compiler really likes doing goofy stuff like pop lr before a leaf call and then branching with 'b' rather than 'bl' for example. Skimming your .txt it seems like some of those subroutines were missed. A lot of the code also looks like it came right from 2012-era Xilinx HAL sources (you can poke through the github archives to find them). This is especially handy analyzing the more complex NAND/SD boot modes.
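One way to catch subroutines that are only reached by a tail call is to scan for an unconditional 'b' that directly follows a pop of lr. This is a rough heuristic sketch, not what any disassembler actually does: it assumes little-endian ARM (A32) code with no Thumb, and it only looks at the immediately preceding instruction.

```python
import struct


def decode_branch(word, addr):
    """Decode an A32 B/BL at addr; return (mnemonic, target) or None."""
    if (word >> 28) == 0xF:
        return None  # BLX immediate lives here, skip it
    if ((word >> 25) & 0b111) != 0b101:
        return None
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:            # sign-extend the 24-bit offset
        imm24 -= 0x1000000
    target = (addr + 8 + (imm24 << 2)) & 0xFFFFFFFF
    return ("bl" if (word >> 24) & 1 else "b", target)


def pops_lr(word):
    """True for 'pop {..., lr}' (LDMIA sp!) or 'pop {lr}' (LDR lr, [sp], #4)."""
    if (word & 0xFFFF0000) == 0xE8BD0000:
        return bool(word & (1 << 14))
    return word == 0xE49DE004


def tail_call_targets(image, base=0):
    """Targets of unconditional 'b' instructions that directly follow a pop of lr."""
    count = len(image) // 4
    words = struct.unpack("<%dI" % count, image[:count * 4])
    targets = set()
    for i in range(1, count):
        br = decode_branch(words[i], base + 4 * i)
        unconditional = (words[i] >> 28) == 0xE
        if br and br[0] == "b" and unconditional and pops_lr(words[i - 1]):
            targets.add(br[1])
    return sorted(targets)
```

Anything this returns that your disassembly does not already list as a function start is worth a look. Expect some false positives from data that happens to look like code.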
Be aware there's some kind of gnarly RSA bignum code in there somewhere!
(I ask, why all the data barriers, I don't think they are really needed) Me too. I think it's just how the Xilinx SDK works (they have write32/write16/write8 etc wrappers). There shouldn't be any real need for it with the page tables set to no cache and only a single core running ...I think...
boot_mode[2:0] = {mio4, mio3, mio5}
This might be why you're not seeing the UART prompt.
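The strap ordering above is easy to get backwards, so here it is spelled out as code. This is just the concatenation from that line turned into a function; the function name and argument order are made up for illustration, and no mapping from the resulting value to a boot device is implied.

```python
def boot_mode(mio3, mio4, mio5):
    """Assemble boot_mode[2:0] = {mio4, mio3, mio5} from strap levels (0 or 1)."""
    return ((mio4 & 1) << 2) | ((mio3 & 1) << 1) | (mio5 & 1)
```

So `boot_mode(mio3=0, mio4=1, mio5=1)` gives 0b101: mio4 is the top bit, mio3 the middle bit and mio5 the bottom bit, not the numerical pin order you might expect.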
JTAG doesn't work in the bootrom at all. Well, ok - the chain might show up but the ARM DAP isn't going to respond at all. The little 512B stub in high memory enables it at 'str r2, [r3]', first thing after the bootrom is locked out.
I think the FAT code is based on the following, but it should also be in the Xilinx 'embeddedsw' repo.
I did have some luck exploiting the ROM and then setting up register state and jumping back near the start. Some functions don't like running twice, so iirc I just jumped to where it selected a boot mode (I was developing the SD exploit, so I'd boot to NAND and exploit *that*, then jump to probably 0x1b70 after setting up a handful of registers). I think I give enough information to plug-and-play the SD card exploit, and part of that payload enables JTAG. This should be a little more detailed than the blog post, should you want to go that route at any point:
Tom's Computer Info / tom@mmto.org