Democratic Underground

To my Friends who think there is too much data for the NSA to deal with...

This topic is archived.
Home » Discuss » Archives » General Discussion (01/01/06 through 01/22/2007) Donate to DU
 
hootinholler Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 03:26 PM
Original message
To my Friends who think there is too much data for the NSA to deal with...
The NSA was founded as an information aggregator to collect communications from foreign nations. Since its inception (I'm unsure of the date, but I'm pretty sure it was well established by the late 70's), it has been one of the big consumers of high-powered computer hardware. Over the years they've bought really big boxen from IBM and Cray, middle-class boxes like VAX and later Suns, and untold numbers of Intel-based processors. It is truly not an Enterprise class shop; rather, it's an Empire class shop. The investment in technology and the recruitment of the brightest brains in computer science are truly staggering.

I've seen a number of people here pooh-pooh the 'wiretap' issue because the NSA doesn't have the bandwidth or capacity to read all of the email, phone conversations, etc. The implication is: I'm not worried, 'cause there's too much stuff for them to deal with.

I could whine on about how it wasn't really a wiretap issue, which is why they didn't go the FISA route, but I'd rather point out that that argument is the answer to the wrong question. The appropriate question is: what portion of all that stuff do we have to deal with to obtain useful results? In other words, it's a matter of efficiency.

I have considerable professional experience in the Information Retrieval field. What I lay out here is not what the NSA has, but, rather, what I would set up given the premise that the system would analyze data streams on the order of a terabyte/hour.
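For a sense of scale, the terabyte/hour premise can be converted into per-second figures. This is only back-of-the-envelope arithmetic, assuming decimal units (1 TB = 10^12 bytes); the function name is made up for illustration.

```python
def sustained_rate(terabytes_per_hour: float) -> dict:
    """Convert a TB/hour intake rate into per-second figures (decimal units)."""
    bytes_per_second = terabytes_per_hour * 10**12 / 3600
    return {
        "megabytes_per_second": bytes_per_second / 10**6,
        "gigabits_per_second": bytes_per_second * 8 / 10**9,
    }


rate = sustained_rate(1.0)
print(f"{rate['megabytes_per_second']:.0f} MB/s, "
      f"{rate['gigabits_per_second']:.2f} Gbit/s")
```

Under those assumptions, one terabyte per hour works out to a sustained stream of roughly 278 MB/s, or a little over 2 Gbit/s.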

The first thing to be captured is the information about the communication, known in the industry as meta-data, or data about data. This in and of itself is a useful byproduct of the preparatory, or grooming, phase of intake processing. Valuable as this data is on its own, a major reason it is needed is noise reduction in the communications stream being monitored; think spam filtering. Unlike those of us trying to filter spam individually, the view of the traffic afforded by the meta-data would allow rejection of broadcast-type messages by analyzing the traffic originating at each source. This traffic would still be cataloged, but probably not indexed, only archived.
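A minimal sketch of that kind of noise reduction, assuming the meta-data has already been reduced to (sender, recipient) pairs. The fan-out threshold and the function name are hypothetical; a real filter would look at far more than recipient counts.

```python
from collections import defaultdict


def split_by_fanout(records, max_recipients=50):
    """Separate broadcast-style traffic from everything else.

    records: iterable of (sender, recipient) pairs taken from message meta-data.
    Returns (to_index, archive_only) as two lists of records.
    """
    records = list(records)
    recipients = defaultdict(set)
    for sender, recipient in records:
        recipients[sender].add(recipient)

    # A sender who reaches more distinct recipients than the threshold is
    # treated as a broadcaster (assumed heuristic).
    broadcasters = {s for s, r in recipients.items() if len(r) > max_recipients}

    to_index = [rec for rec in records if rec[0] not in broadcasters]
    archive_only = [rec for rec in records if rec[0] in broadcasters]
    return to_index, archive_only
```

Nothing is discarded here: the broadcast traffic lands in `archive_only`, matching the "cataloged but not indexed" idea above.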

The remaining traffic would be prioritized for indexing queues. Those digital sources with high priorities would be indexed in a matter of minutes from capture. Indexing yields another set of meta-data that is further analyzed. This set of data is about the content of the messages.
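The prioritized indexing queue can be sketched with a binary heap. This assumes a lower number means higher priority and that items of equal priority should come out in arrival order; the class name is invented for the example.

```python
import heapq
import itertools


class IndexingQueue:
    """Priority queue for messages waiting to be indexed."""

    def __init__(self):
        self._heap = []
        # Monotonic counter breaks ties so equal priorities stay first-in, first-out.
        self._counter = itertools.count()

    def put(self, priority: int, message_id: str) -> None:
        # Lower number = higher priority (an assumption of this sketch).
        heapq.heappush(self._heap, (priority, next(self._counter), message_id))

    def get(self) -> str:
        """Remove and return the highest-priority message id."""
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

High-priority sources simply get a small priority number at intake, so they reach the indexer ahead of the bulk traffic.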

This new data is then inserted into the index collections and passed through filters for automated routing. Those filters are basically searches that have been saved to get new hits. These are not simple keyword searches, but are quite sophisticated.
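The saved-search idea can be shown with a toy router. This sketch uses plain boolean term matching, which is exactly the simple keyword approach the real filters go beyond; the search names and terms are invented.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


# Each saved search lists terms that must all appear and terms that must not.
SAVED_SEARCHES = {
    "travel-desk": {"all": {"flight", "tickets"}, "none": {"refund"}},
    "finance-desk": {"all": {"wire", "transfer"}, "none": set()},
}


def route(text, saved=SAVED_SEARCHES):
    """Return the names of the saved searches that a new document hits."""
    terms = tokenize(text)
    return sorted(
        name
        for name, query in saved.items()
        if query["all"] <= terms and not (query["none"] & terms)
    )
```

Every newly indexed item is run through `route`, and each hit is delivered to whoever saved that search.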

Concepts expressed within the stream would be identified and parties would be marked as subject matter experts as the concept repeats in communications. Social and professional circles would be mapped with alacrity. A concept of interest thus becomes a person of interest which broadens into a circle of interesting parties.
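The step from concept to person to circle might look like the sketch below. It assumes each message has already been tagged with a set of concepts by the indexer, and that repeating a concept a couple of times is enough to be marked; both the data shape and the threshold are assumptions.

```python
from collections import Counter, defaultdict


def circle_for_concept(messages, concept, min_mentions=2):
    """Find who keeps raising a concept, and who they talk to.

    messages: iterable of (sender, recipient, set_of_concepts).
    Returns (experts, circle) as two sets of party names.
    """
    mentions = Counter()
    contacts = defaultdict(set)
    for sender, recipient, concepts in messages:
        contacts[sender].add(recipient)
        contacts[recipient].add(sender)
        if concept in concepts:
            mentions[sender] += 1

    # Parties who repeat the concept often enough are marked.
    experts = {p for p, n in mentions.items() if n >= min_mentions}

    # Their direct correspondents form the circle of interesting parties.
    circle = set()
    for person in experts:
        circle |= contacts[person]
    circle -= experts
    return experts, circle
```

Note that the circle includes everyone the marked party corresponds with, whether or not they ever mentioned the concept themselves.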

The software to do this can be had by anyone with enough money, off the shelf, today.

So, if you're not concerned because of the magnitude of the data set, I ask: "How efficient does this process need to be before you become concerned?" 1%? 10%? 50%?

Even this post is an answer to the wrong question because whatever they did do was done unconstitutionally. I think that even archiving spam violates the law.

-Hoot

Greyhound Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 03:31 PM
Response to Original message
1. Ignorance is, indeed, bliss. n/t
 
spindrifter Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 03:36 PM
Response to Original message
2. Thanks for sharing
your expertise with the rest of us. I have no doubt that they are efficient enough to make it worth their while. And it needs to stop.
 
Dr.Phool Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 03:39 PM
Response to Original message
3. According to James Bamford
Edited on Sat Dec-31-05 03:42 PM by Dr.Phool
In the 1980's, NSA's computers went from 5 and 1/2 ACRES to 11 ACRES, even though computers were drastically shrinking in size. Just imagine how much they've grown in the 20 years since.

On edit: I must add that they were capable of intercepting nearly every communication in the world back then. Think about now.

For more info, read Bamford's books: "The Puzzle Palace", written in 1982, and "Body of Secrets", written in 2001.
 
Vitruvius Donating Member (1000+ posts) Send PM | Profile | Ignore Sun Jan-01-06 11:29 AM
Response to Reply #3
42. NSA has more than HALF the computer capacity in the entire world,
Edited on Sun Jan-01-06 11:44 AM by Vitruvius
measured in either MIPS or memory capacity; all of it optimized for searching massive volumes of data or searching enormous "keyspaces" (the latter for codebreaking). And they're world leaders in computer design & software; especially massively parallel machines. My best guess is that they're roughly a decade ahead of the state of the art that we 'civilians' see.

And their "computer farms" (their term) are even bigger today.

Years ago, I had an interesting experience. I was designing a new DSP (a special type of microprocessor optimized for signal processing and control applications) into one of my inventions and got one of the first development kits for that unit. When I got it, I was amazed at how mature the design was; most microprocessor families aren't that well perfected until they've been out 5 or 10 years. This was obviously an old-but-extremely-advanced microprocessor that had never been sold on the open market. When I nosed around the instruction set, I found a raft of undocumented instructions that were perfect for data searching and code-breaking applications. I needled a friend of mine at the manufacturer about how they'd obviously developed it for NSA and now were selling it to us civilians -- and tried to get him to tell me about -- and sell me -- the follow-on, which they were obviously selling to NSA; I tried to make it easy for him by describing exactly what I thought the follow-on had. Three things happened: 1) those extra instructions mysteriously disappeared from the production units; 2) my friend called me up -- scared out of his wits -- and asked me to never say anything about this to anybody ever again; 3) I got a visit from a rather wimpy spook who tried to scare the shit out of me.
 
FogerRox Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 03:42 PM
Response to Original message
4. I am not an expert-- I would like to offer the short version
1) Datamining software
2) SAICorp If I have the initials right
 
Al-CIAda Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 03:45 PM
Response to Reply #4
5. 3) Enormous black budget, 10x that of the CIA (this is only what is admitted)
 
hootinholler Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 05:14 PM
Response to Reply #4
18. SAIC would be running the machines and doing integration...
As well as others like Unisys, General Dynamics, EDS, etc. There isn't really datamining software per se. Datamining is somewhat better thought of as an activity. The software that enables the activity depends on the nature of the data to be mined. In other words, mining financials requires different tools than mining documents, e.g. SAP versus Verity, to give concrete vendors.

-Hoot
 
gulliver Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 03:53 PM
Response to Original message
6. I agree.
It's totally do-able.
 
DoYouEverWonder Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 03:54 PM
Response to Original message
7. Well if they are so damned smart
maybe they can figure out where that $2.3 billion that Rummie lost before 9-11 went?

And oh, by the way, while they're at it, where's Osama?

 
shadowlight Donating Member (135 posts) Send PM | Profile | Ignore Sat Dec-31-05 06:29 PM
Response to Reply #7
23. It was trillion. 2.3 Trillion.
 
DoYouEverWonder Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 06:36 PM
Response to Reply #23
24. 2.3 Trillion!
I think the number was so unbelievable that I assumed I had read it wrong.
 
hootinholler Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 06:39 PM
Response to Reply #24
25. Yes it was trillions, but they at the NSA aren't accountants ;) n/t
 
EuroObserver Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 07:26 PM
Response to Reply #25
26. US/UK trillions = International billions, I take it:
Edited on Sat Dec-31-05 07:28 PM by EuroObserver
ed: (what you call billions, we call thousand millions. SU standard billion is a million million - which I understand you call trillion).

2.3 million millions.

$ 2,300,000,000,000

Is that right?
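A quick way to check the naming is to print the same figure under both conventions. This sketch assumes the traditional long scale (10^9 = milliard, 10^12 = billion) against the short scale used in the US; the table and function names are made up.

```python
SHORT_SCALE = {10**6: "million", 10**9: "billion", 10**12: "trillion"}
LONG_SCALE = {10**6: "million", 10**9: "milliard", 10**12: "billion"}


def name_amount(value, scale):
    """Express value in the largest unit of the given scale that fits."""
    unit = max(u for u in scale if u <= value)
    return f"{value / unit:g} {scale[unit]}"


amount = 2_300_000_000_000
print(name_amount(amount, SHORT_SCALE))  # short-scale name
print(name_amount(amount, LONG_SCALE))   # long-scale name
```

Same twelve zeroes either way; only the word attached to them changes.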
 
hootinholler Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 07:42 PM
Response to Reply #26
27. Didn't know that.
That's the right number of zeroes, sure enough. It could be stated here as two trillion, three hundred billion.

-Hoot
 
eppur_se_muova Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 09:08 PM
Response to Reply #26
29. US billion = UK milliard, US trillion = UK billion, BOMK
 
TheBaldyMan Donating Member (1000+ posts) Send PM | Profile | Ignore Sun Jan-01-06 08:32 AM
Response to Reply #26
39. I think you mean 'milliard' = 1,000,000,000
usually in financial terms billion and trillion are taken to be the american value.
 
0rganism Donating Member (1000+ posts) Send PM | Profile | Ignore Sun Jan-01-06 12:54 AM
Response to Reply #25
36. As you pointed out earlier, that would be "financial data mining" ;)
And they wouldn't want to do that, right? Might get them a little too close to the guys who cleaned up with the put-options on United Airlines...
 
hootinholler Donating Member (1000+ posts) Send PM | Profile | Ignore Sun Jan-01-06 12:58 AM
Response to Reply #36
37. Or maybe too close to the
Turkish cash connection Sibel alluded to.

-Hoot
 
stevedeshazer Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 03:59 PM
Response to Original message
8. The Nazis catalogued every German using IBM punch cards in the '30s
http://www.ibmandtheholocaust.com/

IBM and the Holocaust is the stunning story of IBM's strategic alliance with Nazi Germany -- beginning in 1933 in the first weeks that Hitler came to power and continuing well into World War II. As the Third Reich embarked upon its plan of conquest and genocide, IBM and its subsidiaries helped create enabling technologies, step-by-step, from the identification and cataloging programs of the 1930s to the selections of the 1940s.

Only after Jews were identified -- a massive and complex task that Hitler wanted done immediately -- could they be targeted for efficient asset confiscation, ghettoization, deportation, enslaved labor, and, ultimately, annihilation. It was a cross-tabulation and organizational challenge so monumental, it called for a computer. Of course, in the 1930s no computer existed.

But IBM's Hollerith punch card technology did exist. Aided by the company's custom-designed and constantly updated Hollerith systems, Hitler was able to automate his persecution of the Jews. Historians have always been amazed at the speed and accuracy with which the Nazis were able to identify and locate European Jewry. Until now, the pieces of this puzzle have never been fully assembled. The fact is, IBM technology was used to organize nearly everything in Germany and then Nazi Europe, from the identification of the Jews in censuses, registrations, and ancestral tracing programs to the running of railroads and organizing of concentration camp slave labor.

IBM and its German subsidiary custom-designed complex solutions, one by one, anticipating the Reich's needs. They did not merely sell the machines and walk away. Instead, IBM leased these machines for high fees and became the sole source of the billions of punch cards Hitler needed.
=====================================================================================================================================
To make matters worse, Prescott Bush was their banker and financier.
 
rman Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 04:03 PM
Response to Reply #8
11. IBM doesn't deny it, but says it wasn't as bad as it's made out to be
see www.thecorporation.com
 
rman Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 04:00 PM
Response to Original message
9. Just think of what google does;
searching millions of pages in mere seconds.

From www.google.com/technology :
"Google runs on a unique combination of advanced hardware and software. The speed you experience can be attributed in part to the efficiency of our search algorithm and partly to the thousands of low cost PC's we've networked together to create a superfast search engine."

Then think of what a secret agency could do.
 
hootinholler Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 06:16 PM
Response to Reply #9
21. Google is solving a different but similar problem.
They have the added complexity of serving the results worldwide. They also don't do the topic analysis that I'm aware of.

In terms of raw horsepower it would not surprise me to learn that the throughput of the NSA's server farms is vastly superior to Google's capability by orders of magnitude.

I'm also sure google is drawn upon to corroborate and enhance persons of interest. I wouldn't be surprised to find a Google server or three on their backbone, or the CIA's or the DIA's or DHS's yadda yadda.

-Hoot
 
w13rd0 Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 04:02 PM
Response to Original message
10. I've tried explaining this as well...
...but I think unless you use analogies people can relate to, they just don't get it.

You go into a grocer and look at the apple pile. Amongst the apples, you see a couple that are bruised or discolored. If the number that are bruised or discolored is sufficiently high, you might consider the whole batch worthy of further quality inspection. You don't have to cut every apple open to make this determination. You don't even have to cut one of them open. So you might use this on a "meta-scale" to separate the unbruised from the bruised. Then among the bruised, you might cut one or two open in each batch and determine if it's truly a "bad apple". If it is, that batch might warrant further scrutiny. If it's not, you cut open another two or three from the same batch. If among those, one or more are bad, you say, ok, this batch is bad, and into the trash it goes.

With data abstraction and collection, the larger the sampling, the more data (the more apples), the more "efficient" your methodology becomes.
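The apple inspection can be put into numbers. As a rough sketch, assume a fraction `bad_fraction` of a batch is bad and that each apple cut open is an independent draw (sampling with replacement, a simplification); then the chance of hitting at least one bad apple in `samples` cuts is 1 - (1 - bad_fraction)^samples. The function names are invented.

```python
import math


def detection_probability(bad_fraction: float, samples: int) -> float:
    """Chance that at least one of `samples` independent draws is bad."""
    return 1.0 - (1.0 - bad_fraction) ** samples


def samples_needed(bad_fraction: float, target: float) -> int:
    """Smallest number of draws whose detection probability reaches `target`."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - bad_fraction))


print(detection_probability(0.10, 2))
print(samples_needed(0.10, 0.90))
```

Under these assumptions, a batch that is 10% bad gets caught only about one time in five by cutting two apples, and it takes a couple dozen cuts to push that to 90%. The number of cuts depends on the bad fraction, not on the size of the pile.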

Now, even though it becomes more efficient, it's not 100%. So you are going to be throwing out one or two good apples with the bad. And in our country, that's not permissible. Even one innocent person getting the death sentence is one too many. Our constitution makes specific reference to us having rights AS INDIVIDUALS, and these practices VIOLATE those rights.

And indeed, this also ensures that ways around this will be found (and indeed already have been).

A message with one recipient will receive more scrutiny, so I'll encode my "secret message" in an image, with the message text being a simple cipher wrapped in encryption, and I'll email that image in a greeting card I send out to 20 or a hundred people.

The very nature of the technology can be used against it. Look at how spam methodologies improve over time and find ways around your filters. In the same manner, criminal organizations find ways around technologies they know are being used against them. Many corporations now employ VPNs and encrypted mail. Cases are being prosecuted for corporations keeping "two sets of books" as a way to get around account auditing.

Our society generates a great deal of "white noise". And there will be ways that others exploit this.
 
dusmcj Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 04:15 PM
Response to Original message
12. possible? gather it all, search on demand ? with a warrant bwahahahaha
This made me wonder whether a fundamental paradigm shift is possible or, rather, has already occurred: instead of getting a warrant to surveil a particular party and then collecting communications associated with them, are we now collecting all communications, tagging them with the metadata you mentioned, and filtering/searching on demand? So that when a warrant is issued to surveil Billy Bob's communications, all that has to happen is that the already-gathered comms are searched for content of interest? Remember, I said "already collected".
 
hootinholler Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 05:04 PM
Response to Reply #12
17. I think that is part of what they are doing...
The collection and archival alone is illegal. The other thing I suspect they are doing is identifying people for further scrutiny.

-Hoot
 
EST Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 04:26 PM
Response to Original message
13. Thanks, beautifully done.
I, too, have attempted to demonstrate the scale of operations and the fact that it is an easy (comparatively) job, for any organization of sufficient size.
My own experience goes back to the IBM 1400 and 7000 series, and I know for a fact that the CIA got some of the early mass storage devices (the IBM 2321) as soon as they became available in the 70s. Even at CIA, with their massive computer systems, they spoke in awe about the really huge stuff NSA used. This was 30 years ago.
 
hootinholler Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 04:48 PM
Response to Reply #13
15. Thanks, this is a really simplistic view.
I'm ignoring a number of feedback loops from actual analysts that would tend to further refine the targeting of the stream, among other aspects, but I hoped to be understandable to those with little understanding of the technology.

It's truly amazing the contributions to CS that have been made by NSA affiliated scientists.

-Hoot
 
formercia Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 04:37 PM
Response to Original message
14. I first visited the NSA in 1972
and it was a big operation then.......
 
bleever Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 04:53 PM
Response to Original message
16. Rec'd. Thanks for the great info. n/t
 
CrispyQ Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 06:04 PM
Response to Original message
19. Beware Fat Boy & Llewellyn. --nt
 
hootinholler Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 06:17 PM
Response to Reply #19
22. Sorry, I don't get the references.
:shrug:

-Hoot
 
CrispyQ Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 09:21 PM
Response to Reply #22
30. Sorry. I should have explained.
It's from the book "Shibumi" by Trevanian. The govt has a massive database, Fat Boy, with so much information that it takes an especially talented person, Llewellyn, to know how to phrase inquiries so as to retrieve enough relevant data, yet not inquire too deeply so as to be overwhelmed with data. A most excellent piece of fiction!
 
Buns_of_Fire Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 06:06 PM
Response to Original message
20. And if you networked the NSA, CIA, and Google computers all together...
...betcha still couldn't find any trace of integrity in boosh*...
 
Pithy Cherub Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 09:03 PM
Response to Original message
28. It's not simply WHAT hardware the NSA has, but WHO they hire.
Edited on Sat Dec-31-05 09:05 PM by Pithy Cherub
They have some of the best minds (think mathematicians) on the planet and reward them handsomely for their work. All that technology *stuff* has to be made to run, and your average American is not aware that NSA is paying high dollars not only for the administration of this, but also for Research & Development. The talent inside the organization is vast and crosses many disciplines beyond technology. The linguists, cryptologists, and analysts (economic and intelligence) are there in numbers the CIA, FBI and Defense Department can only dream about. They also specialize in things we can only guess about.

And nobody is watching the watchers...
 
hootinholler Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 10:43 PM
Response to Reply #28
31.  And nobody is watching the watchers
Well, the NSA receives more oversight than the DIA, which is running similar programs, although the ones we know about are apparently legal (Able Danger).

Who knows what oversight the new UberSecurityChief gets.

-Hoot
 
Terran1212 Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 10:45 PM
Response to Original message
32. My sister has had many interviews with the NSA
She said in one (although this may've been the Embassy people), the interviewers kept saying "Insha'alah" (we're Muslims).

She currently does Homeland Security work in Atlanta.

I think she wants to work for every creepy spy agency that will give her a job interview : p
 
bpilgrim Donating Member (1000+ posts) Send PM | Profile | Ignore Sat Dec-31-05 10:47 PM
Response to Original message
33. all the data is stored on all corp americas servers - and they got access
to them as well as their own server farms.

IBM is also very good at data mining, not to mention GOOGLE.

think about it...

peace
 
hootinholler Donating Member (1000+ posts) Send PM | Profile | Ignore Sun Jan-01-06 12:38 AM
Response to Reply #33
34. Well, I really don't think that's practical or advisable
Even to the neocons. We're talking about being the fly on the wall here. It would be way too unreliable to depend on external servers. If my knowledge of secured sites is correct, they have no direct connection to the outside world; it all comes in via logged media.

There are private companies aggregating data, but not message traffic, at least legally.

Now I believe that Corporate Amerika would be very receptive to providing feeds to most any three letter agency.

-Hoot
 
bpilgrim Donating Member (1000+ posts) Send PM | Profile | Ignore Sun Jan-01-06 12:47 AM
Response to Reply #34
35. 'message traffic' is what some of the biggies do for a living
and they leave the door open for big bro to listen in.

studying their archived data is even easier.

peace
 
TheBaldyMan Donating Member (1000+ posts) Send PM | Profile | Ignore Sun Jan-01-06 05:02 AM
Response to Original message
38. I have a link here to a report on intercept capabilities dated 2000
Edited on Sun Jan-01-06 05:03 AM by TheBaldyMan
that was 5 years ago, capabilities have probably advanced a fair way by now.

Interception Capabilities 2000

edited for typos
 
hootinholler Donating Member (1000+ posts) Send PM | Profile | Ignore Sun Jan-01-06 10:53 AM
Response to Reply #38
41. Thanks for the link! n/t
 
Silverhair Donating Member (1000+ posts) Send PM | Profile | Ignore Sun Jan-01-06 08:39 AM
Response to Original message
40. NSA was established in 1952. NT