(I apologize if this post is obscure. It's more of a "meta-answer" than an answer.)
The term 'JPEG' (Joint Photographic Experts Group) refers to the 'lossy' algorithm with which the image is compressed. When that compressed data is stored in a file, one must consider whether the file is formatted as JFIF (JPEG File Interchange Format) or SPIFF (Still Picture Interchange File Format), where SPIFF is the later format, intended as a successor to JFIF.
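To make the JFIF/SPIFF distinction a little more concrete, here is a rough Python sketch that guesses the container from the first few bytes of a file. The function name `sniff_container` and the returned strings are my own inventions for illustration; the byte signatures (SOI marker, then an APP0 segment tagged "JFIF" or an APP8 segment tagged "SPIFF") are as I understand them from the specifications, so check them against the published texts before relying on this.

```python
def sniff_container(data: bytes) -> str:
    """Guess the container format from the leading bytes of a file.

    Hypothetical helper for illustration only; not taken from any library.
    """
    # Every JPEG stream begins with the SOI (start of image) marker FF D8.
    if data[:2] != b"\xff\xd8":
        return "not JPEG"
    # JFIF: APP0 marker (FF E0), a 2-byte length, then the tag "JFIF\0".
    if data[2:4] == b"\xff\xe0" and data[6:11] == b"JFIF\x00":
        return "JFIF"
    # SPIFF: APP8 marker (FF E8), a 2-byte length, then the tag "SPIFF\0".
    if data[2:4] == b"\xff\xe8" and data[6:12] == b"SPIFF\x00":
        return "SPIFF"
    # Anything else (Exif, a bare JPEG stream, ...) is left unclassified here.
    return "JPEG (other or unknown container)"
```

In practice one would read the first dozen or so bytes of the file in binary mode and pass them in; anything that is neither JFIF nor SPIFF is deliberately lumped together.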
What we're dealing with here is data, file formats, and software ... NOT photography.
TASI has a decent summary-level tutorial at
http://www.tasi.ac.uk/advice/creating/fformat.html
An off-the-cuff description of the layout of a JFIF or SPIFF file is beyond my expertise. As a (former?) software engineer, I have always relied on published standards and specifications (IEEE, ANSI, ISO, etc.) for such technical issues, and even for the formats I've worked with, I'd be incapable of describing them without the reference resources at hand. (The web isn't really enough for this.) The standards and specifications for JFIF and SPIFF fall under the category of "Data Interchange Standards", the text of which typically employs "C" language snippets to describe data layout. Relevant standards include "ISO/IEC 10918-4:1999, Joint Photographic Experts Group (JPEG)".
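For a flavour of what "describing data layout" means here, the sketch below reads the fixed-size part of a JFIF APP0 header. The standards themselves would show this as a C struct; I've used Python's `struct` module instead. The field order (marker, length, "JFIF" tag, version, density units, X/Y density, thumbnail size) is my recollection of the JFIF layout and should be treated as an assumption to verify against the specification. The names `JfifHeader` and `parse_jfif_header` are made up for this example.

```python
import struct
from typing import NamedTuple, Tuple


class JfifHeader(NamedTuple):
    version: Tuple[int, int]  # (major, minor)
    units: int                # 0 = aspect ratio only, 1 = dots/inch, 2 = dots/cm
    x_density: int
    y_density: int
    thumb_width: int
    thumb_height: int


# Big-endian, no padding: marker(2) length(2) tag(5) major minor units
# Xdensity(2) Ydensity(2) Xthumbnail Ythumbnail
_APP0 = struct.Struct(">2sH5sBBBHHBB")


def parse_jfif_header(data: bytes) -> JfifHeader:
    """Parse the fixed part of a JFIF APP0 segment (hypothetical helper)."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    if len(data) < 2 + _APP0.size:
        raise ValueError("data too short for a JFIF header")
    # The APP0 segment starts right after the 2-byte SOI marker.
    (marker, _length, tag, major, minor, units,
     x_density, y_density, thumb_w, thumb_h) = _APP0.unpack_from(data, 2)
    if marker != b"\xff\xe0" or tag != b"JFIF\x00":
        raise ValueError("no JFIF APP0 segment found")
    return JfifHeader((major, minor), units, x_density, y_density,
                      thumb_w, thumb_h)
```

This only covers the fixed fields; any embedded thumbnail data and everything after the APP0 segment is exactly where the published specifications become indispensable.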
I again apologize for the 'smoke'n'mirrors' language, but if I were even beginning to work with such file formats, I'd be assembling a virtual bookshelf of relevant texts. I can easily imagine the minimum number of 500-page texts needed to work with just the 'JPEG' formats to be around 8-10, dense with code snippets and block diagrams.