|04-17-2019 04:39 AM|
Just thought I'd update this thread, that only a few would be interested in.
I've found myself deep down a rabbit hole with this PCM/ECU programming subject.
I've ended up following this forum --->>> pcmhacking.net
I've read the book 'GM Gen III LS series Powertrain control systems'. This was a great read. It's easy to understand and confirmed everything I thought was going on. Highly recommend it as an intro. It's very basic, but still excellent. Even though it's centred around a GM pcm, the principles overall are similar in other makes.
Luckily, I'm just starting a 'boring LS overdone to death' swap, and have the 411 pcm, which happens to be the most popular and thus documented pcm. Good place to start.
Tonight I've started to read 'The Car Hacker's Handbook'. Far more programming and architecture involved than the above book. Comes highly recommended.
Anyone read it?
|08-27-2017 06:49 PM|
I just got back from a few days wheeling in the sticks here.
Wow, just incredible.
I'll have to set aside some good time to read over that.
Now I've got my hands full
|08-24-2017 10:45 PM|
I'll see if I can shed some light on the basics.
When you "compile" code you're actually doing a series of things. First the compiler looks through the source code, identifies the keywords it knows, applies the syntax rules of that language, and figures out how to turn it all into assembly for the processor you're using. Then a thing called the linker comes in, figures out all the memory you'll need and where to place things, and puts that information into the assembly code the compiler made.
A simple example would be something like this:
#define AtTopDeadCenterPistonOne 0x0 // 0 degrees is TDC for cyl 1 (note: no ; after a #define)
bool AtTopDeadCenter = FALSE; // this says "reserve x bits for me that
                              // will only be viewed as true or false,
                              // where x bits is the default size of the
                              // processor registers"
// We are assuming here that a function that reads the crank position sensor
// exists and that it returns degrees of crank rotation
if ( GetCrankPosition( ) == AtTopDeadCenterPistonOne )
    AtTopDeadCenter = TRUE;
Would compile into something like this for an ARM processor (note that ; starts a comment in assembly):
bl __GetCrankPosition ; bl is branch-and-link; note that __ is used to mark linker-inserted locations
ldr r1, [pc, #0x30]   ; this loads a compiled hard define into register r1 from the
                      ; location of the program counter plus 0x30 hex
subs r0, r1, r0       ; this subtracts the value in r0 from r1, storing the result
                      ; in r0 (the s suffix makes it update the status flags)
bne 0x400             ; conditional branch if the result was not zero, to address
                      ; 0x400 (assume 0x400 is the address after the if statement in our function)
mov r0, #0x01         ; the compiler chose to keep bool AtTopDeadCenter in r0 as its final resting place,
                      ; and because our conditional branch was not taken, we know the function returned 0
                      ; and thus we should set AtTopDeadCenter to true (aka 0x01 instead of 0x00)
So here you can see that the processor has specific instructions and each does a different thing. bl is branch-and-link: it puts the target address in the program counter and saves the return address in the link register so the called function can get back to us (a plain b is the same branch without saving a return address).
ldr is load register. r1 is the destination (a register in the processor); the next part is confusing and requires an understanding of pointers. Any time [ ] is used it means "de-reference this", aka use this value as an address and load whatever is there. So in this case it takes the value in the program counter, adds 0x30, and then uses that as an address to grab whatever data is at that address and place it into r1. This is done because the #define makes it a hard-coded value, so the "data" for that define is stored in program memory, and thus we have to go fetch it. Remember that any number can be used as an address or viewed as an instruction, but only certain numbers are valid. If you try to execute an address with bogus numbers in it as an instruction, the processor will cause an interrupt that is designed for "shit, I don't know what to do, that wasn't valid". That interrupt is hard-tied to force the program counter to execute whatever instruction is at the address given by that IRQ's location in memory (that's complicated, I can talk about it later if you want).
subs is the subtract command: first you give it the final resting place for the calculated data, then you give it the register to subtract from and the register that has the value you want to subtract. The s suffix tells it to update the status flags with the result.
bne is branch-if-not-equal, aka "load this value into the program counter if the last flag-setting instruction didn't set the zero flag in the status bits of the processor". The status bits are a series of bits that get updated after flag-setting instructions. In the case of the subs, if the result was zero, the zero status bit would get set, else it would not, thus giving us our condition for our "conditional branch".
You can actually force asm code into C or C++ code using compiler directives if you wish, and even call it as if it were a function, or just put it anywhere you want; you just have to use the right syntax for said compiler.
Note here also there are many assumptions that are "normal". For example, no memory (DRAM) was used for data, because everything we needed could be done using registers, and thus the compiler never actually reserved any DRAM space for us. Second is that the return value from the function we called was stored in r0, which is very normal. The return values of functions are usually stored in r0, or on the stack if they are large (or in multiple registers; the compiler chooses what is best).
Note also that in ARM code the r14 register is reserved for the return address of a function and filled out for us automatically by the bl instruction, and that for single-function-deep calls no stack is used for the return address, whereas on many other processors the return address is always placed on the stack before we branch away to said function.
I guess I should also mention the program counter is just a pointer that says "load the address that is the value of the PC and execute it as if it were an instruction, then add however many bytes the instruction was in size to the PC and repeat"; that's how the processor runs through code. The only time that changes is if something loads the PC, like a branch, a direct write to it (if allowed), or returning from a function (some processors have instructions that do this specifically; sometimes it has to be done manually).
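That fetch/execute loop can be sketched as a toy interpreter in C. The opcodes and the two-word instruction format here are invented for illustration, not any real ISA:

```c
#include <stdint.h>
#include <assert.h>

/* made-up opcodes for this sketch */
enum { OP_LOAD = 0, OP_ADD = 1, OP_BRANCH = 2, OP_HALT = 3 };

/* each "instruction" is two words: opcode, operand */
int32_t run(const int32_t *program)
{
    int32_t pc = 0;   /* the program counter: an index into program memory */
    int32_t acc = 0;  /* a single accumulator register */
    for (;;) {
        int32_t op  = program[pc];
        int32_t arg = program[pc + 1];
        pc += 2;                       /* advance the PC past this instruction */
        switch (op) {
        case OP_LOAD:   acc = arg;  break;
        case OP_ADD:    acc += arg; break;
        case OP_BRANCH: pc = arg;   break; /* a branch just loads the PC */
        case OP_HALT:   return acc;
        }
    }
}
```

A real processor does all of that in hardware, of course; the point is just that the PC is nothing magic, it's a value that normally marches forward and only jumps when a branch (or interrupt) loads it.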
A note on pointers:
Sometimes understanding pointers can be VERY confusing, let me give an example. They are basically... well... addresses, just like for your house. If you wanted to go talk to Joe who lived at 123 mayberry lane, well then you know the address and Joe is the data at that address.
So say you did something like this in C on a 32-bit processor:
int *ptr;            // a pointer to an integer
int data;
ptr = (int *)0x500;  // the cast tells C to treat the raw number as an address
data = *ptr;
This means "take the data at address 0x500 and load it into the integer "data"".
ptr = &data;
means "put the address of where the integer "data" lives into ptr".
That can take some time to wrap your head around, but it's exactly like how you use an address for any location in the real world.
I personally found that C and C++ made much more sense to me after I understood how assembly works, because it linked together what I was saying and how the machine was actually doing what I said. I've found that different people learn and understand differently however, so learn whatever way works best for you.
|08-24-2017 03:00 PM|
I'll find the 8086 manual.
I've had the Arduino mega2560, with a kit of components, for a while now
It's the arduino that triggered the massive curiosity to understand and experiment with the bigger systems.
I haven't looked at it for 3 weeks, because wheeling, maintenance and mods have sucked up all my spare time.
Ahhh, that bad bad feeling of a subject seeming far too foreign to ever be able to understand. It is incredibly humbling hey. Recently my entire mind was seriously all tangled up
trying to understand the C language alone, and still will be. Oh, the pangs of anxiety that test resolve.
If I wasn't very interested in the topics, I would have thrown in the towel many times already. I had to seek some outside comfort and encouragement too, also got teary wondering if I had it in me.
But then, it starts to click as you say, one win at a time, one revelation at a time.
Similar to physical exercise hurting your body, trying to understand something new hurts the brain. The harder and more complex the subject, the more it hurts.
|08-24-2017 09:49 AM|
A good place to start, though it can be rather confusing, is to read the old original manual to the 8086 processor.
It will explain how a program counter works, how a stack pointer works, how assembly code works for that processor.
My first 2 weeks in a class called "computers as components" were HELL because all we did was read that manual and then have some lab courses where we wrote in assembly code. It took that full 2 weeks for the whole thing to click and make sense and it drove me NUTS. I literally was on the phone with my dad in tears because I thought I'd never understand this, and then one day... it just started to click.
So be patient, keep at it, it will eventually make sense. You might go buy yourself something like an Arduino basic starter kit and read the manual for the processor it uses (for example the ATmega1284P is a good place to start). Play with that for a while, and look at the code with a good editor (Source Insight is a good option, but some people like SlickEdit or Eclipse). It will start to click eventually.
Once you have the basics you can learn how to start hijacking simple older ECM systems using flash chip programmers and the like, and then move into the more complicated things like the newer ECMs.
Oh, and you're more than welcome. I love to enable people because the more people who are able to do something, the better it is for everyone, and the less work it is on those who know it (making it so they can do more!).
|08-24-2017 02:28 AM|
That's very informative, not too long at all. You've answered a lot more in those two posts than hours and hours of me searching the web. Also with regards to controller systems other than auto ECM. Excellent.
I'm going to read over it a couple of more times, look into some aspects of it, think it all through. I'll be back.
Much appreciate the effort you put in. Didn't expect any replies like this. Thanks
|08-23-2017 10:45 AM|
I sadly don't have any links; much of the information comes from college courses and experience.
There are sometimes interesting things that are done in regards to sensing intrusion as you are talking about, but at the same time there is a limit to what can be done. I'll talk about that in a minute.
Analogue vs digital is... a misnomer, I'll say. Technically everything is analog; digital is just a way of expressing that we segmented said analog signal into a limited "all or nothing" view. The reality is that any square wave is truly analog, especially once you start getting to high frequency. Square waves are modeled as an infinite sum of sine waves at different frequencies and phases, which is what's required to get a perfect vertical edge; this is based on what is called the superposition principle. In reality, however, you can't have this infinite number of sine waves, and thus all square waves have what is called an "edge rate".
The faster a processor is, the faster the edge rates need to be to meet the setup and hold times of the transistors being triggered by said square wave. The issue is that the faster your edge rate is, the more energy you put into the higher-frequency sine waves in said square wave, which then causes issues with your traces turning into antennas and with reflections in the PCB lines. This is exactly why on some motherboards you will see squiggly lines in the DRAM traces: it's called impedance matching, and they are trying to make that line (being stupidly long at several inches) work at such a high frequency as they are using. This is where the line between digital and analog gets blurred, because if the analog properties of your digital square wave are not good enough, it won't work digitally, lol.
Sorry that was some really deep level stuff there. A good book on the above subject is called "High Speed Digital Design: A Handbook of Black Magic", be warned, that is masters degree level electrical engineering material.
The reason I had to cover the above first is that it's related to your question. Can you put a logic analyzer on any DRAM bus or serial EEPROM traces and see exactly what bits are flying across the bus? Yes... yes you can. However, you might need a really expensive one that doesn't add a lot of impedance to the trace lines or you'll screw up the signal. This is how some intrusions are prevented.
Sometimes there are also fuse lines built into the chips, where after programming and testing are completed they blow the fuses so you can no longer use the diag bus on said chip to see the internal workings. This means you are SOL unless you can cut open said chip and connect to the inside connections, usually requiring a clean room and some very precise machining. You can still trace the normal communication lines, but the diag/programming lines (depending on the exact chip) often give you MUCH more ability and access to everything.
There are methods in code to detect tampering like CRC checks (cyclic redundancy check) which is math that can detect even single bit changes in large code chunks, so you have to recalculate that and sneak it in with any changes you make.
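As a concrete illustration, here's a minimal bitwise CRC-32 in C, using the common IEEE polynomial in its reflected 0xEDB88320 form. An ECU may well use a different polynomial, width, or initial value, so treat this as a sketch of the technique rather than any particular ECU's check:

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Bitwise CRC-32: reflected IEEE polynomial 0xEDB88320,
   init 0xFFFFFFFF, final inversion. Even a single flipped
   bit in the input changes the result. */
uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int b = 0; b < 8; b++)
            /* shift right; XOR in the polynomial only when the low bit was set */
            crc = (crc >> 1) ^ (0xEDB88320u & (uint32_t)-(int32_t)(crc & 1u));
    }
    return ~crc;
}
```

After editing a table you'd run the firmware's CRC over the modified region and write the new value wherever the firmware stores the expected one. The standard check value for the ASCII string "123456789" with these parameters is 0xCBF43926, which is a handy sanity test for any CRC-32 implementation.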
Sometimes there are key codes that are required to be passed in over the bus before you can program or read. Sometimes the encryption they are using alone makes it too hard to mess with because you have to decrypt the whole thing first.
Sometimes they use tricks with their memory controllers in what is called paging (virtual memory), where the hardware knows where everything actually is, but the firmware isn't privy to that; it just requests a "page" and an offset into that page, but where that actually is in the DRAM address space only the hardware knows. When tracing said DRAM bus you'll be looking at the hardware address, which makes figuring out what the firmware was doing much harder.
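A toy version of that indirection in C; the page table contents here are entirely made up, the point is just that the firmware-visible (page, offset) pair tells you nothing about the physical address you'd see on the DRAM bus without the table:

```c
#include <stdint.h>
#include <assert.h>

#define PAGE_SIZE 0x1000u  /* 4 KB pages for this sketch */

/* only the hardware (this table) knows where each page really lives */
static const uint32_t page_base[4] = {
    0x00020000u,  /* page 0 */
    0x00050000u,  /* page 1 */
    0x00030000u,  /* page 2 */
    0x00010000u,  /* page 3 */
};

/* firmware asks for (page, offset); hardware emits the DRAM address */
uint32_t translate(uint32_t page, uint32_t offset)
{
    return page_base[page % 4] + (offset % PAGE_SIZE);
}
```

So a logic analyzer on the bus sees 0x00050010 while the firmware was thinking "page 1, offset 0x10", and consecutive firmware pages need not be anywhere near each other in DRAM.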
Layers of obfuscation like that can make reverse engineering this stuff kind of hard.
Most of the time, though, ECUs are designed to be cheap and thus don't have those things, but that is changing. The N55 control computer in the BMW world, for example, took MUCH longer to figure out how to crack for programming than the N54 did, and the first attempts at it required drilling a hole in the PCB to remove a trace to the EEPROM so you could program it. They later figured out how to do it without resorting to that.
There are different classifications of responses to threat detection as well. The vast majority of them simply cause the device to stop working. In the military world they might send a notification if possible or save off info that its been tampered with, or more extreme, destroy all data on said device and render it unusable. The next extreme is self destruction with actual force, and the last extreme is killing the person trying to tamper with it, usually with explosives (yes some devices do this, but they are VERY rare and ONLY exist for the military).
As far as multi processor communication, that completely depends on the design. Sometimes they are allowed to share the same DRAM, sometimes they are not, sometimes they each run their own different code and dont even talk, sometimes they have a shared internal RAM space or register space to talk to each other, sometimes they even share instructions and split them between who has time to process said instruction.
Most commonly they have a small shared register or RAM space where they can talk directly, and then are allowed to share DRAM space at the hardware level, though there may be further levels of restriction on that in the code. This is called semaphore control, and it is very important (you can't have someone else come in and change what you're working on while you weren't looking and hose you over). In C++ this is most commonly wrapped up neatly in classes, but in C semaphore control is left to your own design.
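Here's a minimal C sketch of that idea, with a POSIX mutex standing in for the semaphore and two threads playing the role of two processors sharing one variable. Without the lock, the read-modify-write on the counter could interleave and lose updates:

```c
#include <pthread.h>
#include <assert.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter = 0;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    /* take the semaphore */
        shared_counter++;             /* the read-modify-write is now safe */
        pthread_mutex_unlock(&lock);  /* give it back */
    }
    return NULL;
}

/* run two "processors" against the shared variable and return the total */
long run_two_workers(void)
{
    pthread_t a, b;
    shared_counter = 0;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

On a real multi-core ECU the lock would be a hardware semaphore register or an atomic test-and-set in shared RAM rather than a pthread mutex, but the discipline is the same: take it, touch the shared thing, give it back.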
Further, in most common cases each processor is running its own "thread" in what is called an RTOS (real time operating system). These are sometimes off the shelf things people sell, or they are home brew designs. Sometimes they are polling driven, sometimes they are interrupt driven. A "normal" RTOS would be one that gives each thread a certain amount of execution time and then makes it give it up mid function to change over to another thread on the same processor (and the kernel keeps track of all that and gives each thread a processor and an amount of time) which means that semaphore control between threads is VERY important.
Usually a timing-based RTOS is controlled with counter/timer interrupts, and if a thread has exceeded its execution time the program counter of the processor is changed to force execution of another thread, while the "context structure" of the current thread is saved by storing its information on the stack. Each thread usually has its own stack (note I'm simplifying things). Usually the kernel also keeps track of its own additional information for each thread: where was the stack pointer pointing? What were all the register values for the CPU at the context switch (the changing of threads I was referring to)? Etc.
Some kernels will only force a context switch on events, such as an external interrupt that came in and needs higher-priority service. So threads also have priorities, and if a higher-priority thread needs to do something, it takes over processor control and executes while the lower-priority guy has to wait; control is then returned to whoever was interrupted when the higher-priority guy is finished.
Those are excellent points of attack, because if you can sneak your own function pointer into the context switch rather than the "thread" the kernel thought it was going to, BOOM you just took control. But there are many ways to program in protections against this, and that's exactly the kind of stuff the hacking world gets into the gritty details of to get around.
Sorry, I've been told talking to me can be like drinking from a fire hose sometimes, so I'll stop there and you can ask more questions if you want.
|08-22-2017 09:16 PM|
Thanks for the effort in your reply. It's definitely appreciated.
I'm just at the beginning of understanding exactly what goes on, so any hints, and direction is stimulating me on.
You've answered many questions, but also triggered some more.
It's not surprising that they'd make an effort to protect their software from easy manipulation, and having multiple processors involved sure adds complexity (communicating with each other digitally, not analogue?), which probably then affects my next thought, which I haven't given a lot of yet.
Do you think the software engineers build-in sensing to know when all the analogue and digital data signal is being tapped and recorded (via all the individual wires)? And if they have, what would the program initiate, if anything?
Do you have any good links you could share for more information at the deeper tech end of ECM programming topic? I'm finding it hard to get anything real decent.
|08-22-2017 08:26 AM|
I'm a BSEE who works in firmware, that said I can give a few pointers but nothing on any specific ECM.
I know that many ECMs today are going multi-processor, which makes things a lot harder. However, every system will have its code stored in flash for the most part (so it can be updated). Usually this is an EEPROM, but for cost it's also sometimes some kind of NAND or NOR flash.
You can usually pull the code out of the flash device on the PCB using an off the shelf programmer or sometimes through the OBDII port if you can figure out the commands needed. If the engineers did their job well the code is compressed and/or encrypted on the flash part making it one step harder to get to the code.
Next, you're right, it will be in "machine code", but translating that into usable assembly code requires you to know exactly what processor they are using. Sometimes that's easy because it's a single chip on the PCB and you can figure it out from the silk screen on the chip or board, but many times companies use custom SoCs, which house the processor in the chip with many other things, and so you have no idea what processor they are using (or even how many there are) without doing a lot more work.
Once you know the processor, and assuming the code isn't compressed or encrypted coming off the flash, you can get what is called a disassembler for that processor and run it over the machine code to convert it into assembly, which then makes things much easier.
If you have what is often called the DAMOS, or ELF TYPES, or .AXF for the build you can then use that to map out where all the functions start in the assembly and/or where all the global variables are during execution time and that makes things WAY WAY WAY easier.
In the BMW world (where I've done some dabbling, for example) the N54 engine uses a triple-processor setup and all the function names and variables are in German, making it a PITA to figure things out sometimes, and we only have a very early version of the DAMOS that isn't complete.
If you can get all of those things together, there are sometimes decompilers that will even let you make source code (C or C++) out of the assembly, but that can get really confusing and ugly, because you don't know what compiler they used to compile their code, which makes reversing assembly into a high-level language rather hard. Your best bet is to just learn the assembly for that processor and then read the assembly code.
Once you have that, you can start screwing around inserting your own assembly code. If they didn't put in any kind of protection against executing errant code, your life is easy. If they did, you sometimes have to hide function pointers in things like IRQ vectors (if they are not hard-coded) and the like to trick the processor into running your code; one great way is to sneak a function pointer into the return address on the stack of a known-good function the code already calls (assuming its location on the stack is constant at some known time). If they used function pointers in their code a lot, your job becomes a lot easier.
Table modification, on the other hand, is fairly simple, and even if they are CRC checking the tables you can usually re-calculate the CRC value and shove it in after you modify the table, so the code is none the wiser that you changed it. Table modification usually lets you change things like AFR, injector duty cycle per RPM per MAF flow, etc. (it all depends on how the ECM works). Often these tables are in two or three dimensions, so you have to look up how multidimensional arrays are laid out in a flat memory space (it's not too complicated) to recognize them in assembly.
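A quick C sketch of that flat layout (row-major order, which is what a C compiler produces; the axis sizes and the fuel-table framing here are invented for illustration):

```c
#include <stdint.h>
#include <assert.h>

#define N_RPM  4  /* rows: RPM breakpoints */
#define N_LOAD 3  /* columns: load breakpoints */

/* A 2-D table[N_RPM][N_LOAD] is just N_RPM * N_LOAD consecutive
   words in flash; element [r][c] lives at offset r * N_LOAD + c. */
uint16_t table_cell(const uint16_t *flat, int rpm_idx, int load_idx)
{
    return flat[rpm_idx * N_LOAD + load_idx];
}
```

So when you spot a run of plausible values in the binary next to an indexing pattern like reg1 * constant + reg2 in the assembly, you've probably found a table, and that constant is its column count.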
Hope that answered some questions.
|08-19-2017 04:18 PM|
ECM/TCM programming discussion SPIN OFF OF SPIN OFF
I have been hesitating to start a new thread on this for a couple of weeks, but I've noticed a little interest in the general computing topic (yes, also much disdain too ) so here goes.
Pirate is the place to read from mechanical savants and other brilliant mechanical minded misfits, so hopefully there are the odd computing modders here too. If not there's probably none on any board www right?
I've not been on Pirate for 12 months or so, cos very busy with stuff, including head buried in a little bit of Computer Science study, at the lowest level engineering and low level programming end.
For the unlearned, the ECM is the processor that manages the engine electronics, and the TCM the automatic transmission; they often share tasks. They're computers: controllers with microchips etc.
With the little bit of auto ECM/TCM research I've done, they seem to be programmed in C or C++, or similar. Is this mostly true for the common brands?
Has anyone in here spliced into a decent modded ECM/TCM's 198-odd wires (there's a lot), data logged, and compared all the wires individually, to see what's actually going on all through the operation of the vehicle? Graphed/mapped it all out?
The actual source code won't be in the ECM/TCM; it will be in machine code, so you can't simply pull up the human-readable source code in an IDE. Has anyone in here reverse engineered the machine code?
What pins on the actual ECM/TCM pinout do you look for to connect your PC to?
There's tonnes of things to discuss. I'm mega interested in it.
A/ anyone in here who is a guru, might want to keep 'trade secret'.
B/ anyone in here who is knowledgable does it as a job, and totally not interested in talking about work here in relax time, does not want the attention.
C/ there may be no one in here.