Apollo AGC computing 'power'

Status
Not open for further replies.
dan_h

Guest
Hello, I would like to ask a question about the Apollo missions and in particular the AGC. Simply put: how many operations per second could this early computer actually do?

I have scoured the web, and whilst there are copious amounts of data on the clock speed, the weight, dimensions, RAM, ROM and the fiendishly complex assembler code used to program it, there appears to be nothing on operations per second.

The reason for asking is that I am constantly hearing that product x is x million times better at FLOPS than the Apollo guidance system, but I can't for the life of me find out where or how these people are calculating the FLOPS for the AGC!

Many thanks for easing my pain :)
 
docm

Guest
Can't find a MIPS or FLOPS rating, but modern pocket calculators, far faster than the AGC, run 5-20 FLOPS or better.
 
bobw

Guest
I found something that gives the speed of the regular math instructions.

http://klabs.org/history/history_docs/mit_docs/1009.pdf

Memory Cycle Time: 11.7 μsec
Add Instruction Time: 23.4 μsec
Multiply (excluding Index): 93.6 μsec
Double Precision Add (X+x)+(Y+y)=(Z+z): 234 μsec
Double Precision Multiply: 971.1 μsec
Counter Incrementing: 11.7 μsec

My guess is that the AGC didn't actually use a floating point data storage format, or it would have shown up in documents like this. I have read that the radar output was pulse rate modulated, so distances were probably stored as an integer number of pulses/sec, for example, and converted to floating point for operator displays only (if that; I'm not sure).

I suppose you could look up the most efficient floating point emulation machine language routines and calculate the total time they would take. Most likely the FLOPS factor is just the basic clock speed difference, maybe multiplied by hardware math co-processor efficiency if the websites you read with the "X a billion" claims are hardcore.
 
dan_h

Guest
Thanks for that, I'll read it through :)

Edit: So from what I can tell (and please correct me if I'm wrong; my speciality lies more in the field of advanced speculation than computing), the MIPS for the AGC would be:

MIPS = Clock Rate / (CPI * 10^6)

[where CPI is Cycles Per Instruction]

CPI = Memory cycle time (11.7 μs) / Add instruction time (23.4 μs)

Therefore: MIPS = 2.048 MHz / ((11.7/23.4) * 10^6)

which is about 4 MIPS. That seems about right to me, but I'm unsure that my calculation of the CPI is correct.

I'm also wondering if there is a precedent for assuming that 1 MFLOPS = 1 MIPS...?

n.b. Clock speed lifted from page 21 of http://www.spaceref.com/exploration/apollo/acgreplica/buildAGC1.pdf
 
heyscottie

Guest
Your calculation for CPI is not correct. Cycles in this context refers to core clock cycles. I would expect that this machine was a single-cycle rather than multi-cycle machine, which would give a CPI of 1. I could be wrong in that assumption, though.

There is not a good precedent for assuming that 1 MFLOPS = 1 MIPS, especially in a machine like this one. 1 MFLOPS = 1 MIPS only if floating point instructions can be done in a single cycle, at the same speed as the other instructions. This is unlikely. In a processor where floating point is emulated by calling routines that do fixed point math, as I suspect this one is, you will have MANY instructions running to do one floating point calculation.

Did you see the double precision multiply time of almost 1 ms? That's about 2000 clock cycles for one multiply! I'd be interested in seeing what the single precision numbers are; you could go from that right to a FLOPS count. But I can't imagine it would be much better than about 10 kFLOPS.
 
dan_h

Guest
I am pleased now that I did not go for scientific computing; Fortran was bad enough without going into this confusing world of clock cycles and MHz!

heyscottie (or any other helpful chap/chapess), could you explain your last statement, preferably as though you were speaking to a somewhat slow but eager child?

I did see the double precision multiply times, but I assumed that was just the way they liked to compare computing power back then. I can't even fathom how you calculated the 2000 cycles, even though I know all of the information is right there in front of me.
 
heyscottie

Guest
The clock rate is 2.048 MHz. The double precision multiply time is ~1 ms.

2.048e6 cycles/sec * 1e-3 sec = 2048 clock cycles

Since no native instruction could take that long, we deduce that double precision math is emulated with software.

And Fortran was never my favorite language, either... :)
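
The same arithmetic can be applied to every timing quoted in the thread. The following is a minimal sketch, assuming the 2.048 MHz clock figure and the microsecond timings posted here are accurate; the names `CLOCK_HZ`, `TIMINGS_US`, and `clock_cycles` are made up for illustration.

```python
# Assumption: clock rate and timings are the figures quoted in this thread,
# not independently verified values.
CLOCK_HZ = 2.048e6

TIMINGS_US = {
    "memory cycle": 11.7,
    "add": 23.4,
    "multiply": 93.6,
    "double precision add": 234.0,
    "double precision multiply": 971.1,
}


def clock_cycles(time_us, clock_hz=CLOCK_HZ):
    """Clock cycles that elapse during an operation lasting time_us microseconds."""
    return clock_hz * time_us * 1e-6


for name, time_us in TIMINGS_US.items():
    print(f"{name:<26} {time_us:>7.1f} us  ~{clock_cycles(time_us):7.1f} cycles")
```

With the exact 971.1 μs figure instead of the rounded 1 ms, the double precision multiply comes out slightly under 2000 cycles, so the conclusion is unchanged.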
 
bobw

Guest
In reply to: "Since no native instruction could take that long, we deduce that double precision math is emulated with software."

Right on.

The AGC was doing pretty well to even have a multiply instruction. The processors in famous early computers like the Apple II, Commodore 64, and TRS-80 didn't have multiply instructions; they did it in software by bit shifts and/or repeated addition. The 8088 processor in the IBM XT had a multiply instruction that took ~75 clock cycles to multiply two 8-bit numbers and store the answer in the accumulator.
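
For readers who have not seen multiplication done without a multiply instruction, here is a generic shift-and-add sketch. It illustrates the general technique only and is not taken from any particular machine's routines; the function name `shift_add_multiply` is hypothetical.

```python
def shift_add_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers using only shifts and adds."""
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative operands only")
    product = 0
    while b:
        if b & 1:
            # Lowest bit of the multiplier is set: add the shifted multiplicand.
            product += a
        a <<= 1  # multiplicand doubles at each bit position
        b >>= 1  # move on to the next bit of the multiplier
    return product


print(shift_add_multiply(13, 11))  # 143
```

The loop runs once per bit of the multiplier, which is why a software multiply costs many instructions even for small operands.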
 
dan_h

Guest
That is embarrassingly simple. I am feeling suitably foolish...

Interesting that something so basic had to be software driven.

So if I can find out what the single precision figures were for add and multiply, then I can extrapolate a shaky FLOPS calculation from there?
 
CalliArcale

Guest
In reply to: "Interesting that something so basic had to be software driven."

We take it for granted that our modern processors can have so much built right into their circuitry, but that's only possible because modern circuits can be so tiny, use so little power, and generate so little heat (relatively speaking). Though it's slower to use software to perform those operations, in older computers that was the only way it could be done at all, because you simply couldn't afford the space, mass, power consumption, or heat associated with all that circuitry.
 
heyscottie

Guest
No need to feel foolish; I am a computer engineer, or I probably wouldn't be able to answer many of these questions myself.

Even now, many embedded processors still emulate floating point math. Native floating point units are expensive (in transistors and power) and make the core more complicated. If an application is doing mostly fixed point math (and there are MANY that do), then using a fixed point processor that emulates floating point is probably better. If you are doing a lot of floating point, though, that hits you pretty hard.

Yes, if you can find times for single precision adds and multiplies, I'd say you could estimate a FLOPS count. There's also the question of how good the processor is at getting data into the arithmetic units in the first place, but I'd say those effects would be dwarfed by the add and multiply times.
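
To make "fixed point math" concrete, here is a small sketch of fractional arithmetic done entirely with integers. The Q15 layout (15 fractional bits) is an arbitrary choice for illustration and is not a claim about the AGC's number format; all names are hypothetical.

```python
FRACTION_BITS = 15            # assumption: Q15, i.e. value = raw / 2**15
SCALE = 1 << FRACTION_BITS


def to_fixed(x: float) -> int:
    """Convert a real number to its scaled-integer representation."""
    return int(round(x * SCALE))


def to_float(raw: int) -> float:
    """Convert a scaled integer back to a real number (for display only)."""
    return raw / SCALE


def fixed_add(a: int, b: int) -> int:
    # Both operands share the same scale, so addition is a plain integer add.
    return a + b


def fixed_mul(a: int, b: int) -> int:
    # The raw product carries 30 fractional bits; shift back down to 15.
    return (a * b) >> FRACTION_BITS


half, quarter = to_fixed(0.5), to_fixed(0.25)
print(to_float(fixed_add(half, quarter)))  # 0.75
print(to_float(fixed_mul(half, quarter)))  # 0.125
```

The programmer, not the hardware, keeps track of where the binary point sits, which is the bookkeeping a floating point unit would otherwise do.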
 
dan_h

Guest
Foolish might be my middle name :) I spent 7 years as an astrophysics student and then 3 in the space research industry, but it's all just too closely linked to advanced computing, which I dropped early on as it... well, it was very hard! Besides, I find that astronomy as a hobby contains all the beauty with none of the repercussions when you accidentally forget to carry the 0.2. Plus there is time to actually look at the awesome majesty of nature, which never seemed to happen when I poked at it for a living. So ignorance, in my case, can be bliss ;)

Anyway, getting back on topic: after some pretty long Apollo-related searches instead of spending time actually working, I have finally found the single precision times, hiding under an MIT-related website:

Addition Time: 23.4 μs
Multiplication Time: 93.6 μs

Edit: I now stumble across these figures in the Eldon Hall document posted above. My attention to detail is becoming wanton.

Now I shall pause to actually do some work before attempting to find out exactly how this information might be of use to me in the hunt for FLOPS...
 
dan_h

Guest
Could someone check my logic for me?

Time for one FLOP ~

Memory Cycle Time (11.7 μs)
+ Clock Pulse Time (0.49 μs)
+ Single Precision Instruction Time (variable: either 23.4 μs or 93.6 μs)
+ Clock Pulse Time (0.49 μs)
+ Counter Increment Time (11.7 μs)
+ Clock Pulse Time (0.49 μs)

So I ran a little program that randomly chooses a type of operation to execute 5000 times, and I get an average FLOP time of ~82.9 μs.

Hence an estimate of 12,000 FLOPS...?
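
The program itself was not posted, so the following is only a guess at what such a simulation might look like. It assumes an equal chance of picking an add or a multiply, and takes the memory cycle time as the 11.7 μs listed in the timing table earlier in the thread; every name here is hypothetical.

```python
import random

MEMORY_CYCLE_US = 11.7        # from the timing table quoted earlier in the thread
CLOCK_PULSE_US = 0.49
COUNTER_INCREMENT_US = 11.7
OP_TIMES_US = (23.4, 93.6)    # single precision add, single precision multiply


def flop_time_us(op_time_us):
    """Time for one operation under the additive model proposed in the post."""
    return (MEMORY_CYCLE_US + CLOCK_PULSE_US
            + op_time_us + CLOCK_PULSE_US
            + COUNTER_INCREMENT_US + CLOCK_PULSE_US)


def average_flop_time_us(trials=5000, seed=None):
    """Average time over randomly chosen adds and multiplies (50/50 assumed)."""
    rng = random.Random(seed)
    total = sum(flop_time_us(rng.choice(OP_TIMES_US)) for _ in range(trials))
    return total / trials


avg = average_flop_time_us(seed=0)
print(f"average time per operation ~ {avg:.1f} us, ~{1e6 / avg:,.0f} ops/s")
```

Under these assumptions the average should land in the low 80s of microseconds, in line with the figure reported in the post. Whether the overhead terms belong in the sum at all is a separate question.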
 
heyscottie

Guest
Well, we don't really know the architecture yet. It is possible that it could be loading from memory at the same time it is running the math instructions. And I'm not sure where Counter Increment Time and Clock Pulse Time would come into these calculations.

Without further knowledge, I would just use your 23.4 μs or 93.6 μs numbers.

Most chipmakers, of course, will quote the time for their fastest operation for FLOPS, so they would use the add time of 23.4 μs, giving about 42,700 FLOPS. This is probably the fairest number to compare against data sheets for today's processors. It assumes that you can be doing adds all the time, and that adds are all you would ever want to do. Unrealistic, of course, but that's generally how they are specified.
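
As a quick check of that figure, here is a sketch that turns the two single precision times into operations per second. It loosely counts each add or multiply as one "FLOP", as the thread does, even though the operations in question may well be fixed point; the 50/50 mix is an added assumption for comparison.

```python
ADD_TIME_US = 23.4
MULTIPLY_TIME_US = 93.6


def ops_per_second(time_us):
    """Operations per second if every operation takes time_us microseconds."""
    return 1e6 / time_us


peak = ops_per_second(ADD_TIME_US)            # adds only, the "data sheet" figure
mul_only = ops_per_second(MULTIPLY_TIME_US)   # multiplies only
# Assumed equal mix of adds and multiplies: average the times, not the rates.
mixed = ops_per_second((ADD_TIME_US + MULTIPLY_TIME_US) / 2)

print(f"adds only:       {peak:,.0f} ops/s")      # 42,735
print(f"multiplies only: {mul_only:,.0f} ops/s")  # 10,684
print(f"50/50 mix:       {mixed:,.0f} ops/s")     # 17,094
```

The spread between the adds-only and multiplies-only figures shows how much a single quoted FLOPS number depends on which operation is chosen.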
 
dan_h

Guest
Thanks for your help.

I don't suppose it makes a wild amount of difference really, with current processors counting teraFLOPS rather than kiloFLOPS!
 
heyscottie

Guest
Indeed. I should point out, though, that most processors are in the gigaFLOPS range. Only supercomputers, generally made up of farms of smaller machines, tend to reach the teraFLOPS range in aggregate so far.

But in any case, the AGC just doesn't come close...
 
adrenalynn

Guest
I haven't read the PDF posted yet (I'm just now downloading it), but pipelining techniques (both in the processor and in firmware/software) would skew those numbers pretty heavily too, wouldn't they?
 
heyscottie

Guest
Yes. There is much I don't know about the architecture, so I am making guesses. But I expect that when a single precision floating point add time of ~20 μs is quoted, it is clearly a software addition. And unless there are multiple cores, which I highly doubt, there's not really much, if any, software pipelining or parallelization that can occur.

Even if the core supports instruction pipelining, which it may, we still have the ~20 μs quoted, and there's really no way around that...
 
adrenalynn

Guest
Agreed. At that point, it's all in how you make it "feel", which comes down to the engineer(s) coding it.

Besides, we survived a lot of years and did a lot of really cool stuff without any FPU at all, as you mention with software emulated instructions. I remember contributing to FractInt, an entirely integer-based fractal visualization application. I started losing interest when I got an 80287 FPU and was seeing close to 64 kFLOPS. ;)

And that brings up another point: there are so many incredibly good embedded solutions out there these days, with near zero power and cooling requirements (such as the '386/'387), that one has to wonder why we need to run out and build another solution, unless it's heavily optimized for some very specific task. But then that leads me to start wondering why one doesn't just start leaning on another existing architecture. In my video compression world, I architected a system around FPGAs (Field Programmable Gate Arrays) for heavy compression tasks. For someone like NASA, it seems ideal. Build it once and then retask/repair (i.e. debug) it on the fly. It's also extremely extensible and parallelizable. (I'm sure Google spell-check will complain about that "word", but you get my meaning...)

There's some IP around that, some of which I authored, but clearly there appears to be plenty of room for NASA to get in there for their tasks. And given the money they throw around, licensing and even purchasing any IP in their way should be trivial.

Sometimes it feels to me like they just run around re-inventing the wheel, either as busy-work, or because some division or another feels the need to empire-build.

If we really want to reach for the stars, it makes more sense to me to creatively leverage the things we've learned and built in the past rather than trying to reinvent the whole of human history every time we want to toss a bird out of the atmosphere.

Ok, I
 
heyscottie

Guest
But remember that NASA has an extra set of requirements to worry about.

One of these is that it flies in space. In FPGAs, for example, we have to worry about single event neutron upsets. We don't even like to fly FPGAs high up in the atmosphere, much less in space. I'm not saying it can't be done, but additional shielding and system robustness must be built in.

In addition, there's the system reliability requirement. NASA does NOT want to take any chances. This is why they will rarely use the newest, fastest, most cutting edge processing equipment -- they want something a generation or two back, or something that was designed specifically with their requirements in mind.

While it does lead to inefficiencies, greater cost, less capability, etc. than could otherwise be achieved, we must remember to balance what we consider to be requirements with the requirements for the mission as a whole.
 
adrenalynn

Guest
Well, I don't think I would have placed an embedded 386/387 set in the "latest and greatest" column. Flying 1960s tech has at least as many downsides. A 386/387 would have to fall into "a few generations old", wouldn't it?

Finding shielding for a 386/387 set isn't all that tough. Private industry can handle dropping 'em into a radioactive core that would toast a human in an instant.

FPGA-based systems have been in some pretty spectacular environments too. And again, they have the option of being nearly instantly reflashed if there is an issue, and of providing mass redundancy in design.

I'm a bit unconvinced.

[edit to add: I know of at least one bird up there flying FPGAs. Without 'em, HDTV would be a lot less pretty. Due to NDAs, I'm probably advised to leave that about there...]
 
heyscottie

Guest
Oh, I agree that solutions exist. I am merely pointing out that mission-critical circuits need extra attention when they are flying in space, and usually can't end up being the same circuit you would have designed if it were on the ground. This leads to increased cost, older technology, fewer off-the-shelf options, decreased interoperability, etc. It's not just because NASA loves reinventing the wheel. I assure you that engineers are always looking for the simplest, cheapest, most robust, most elegant solutions that still meet the requirements.
 