
Ohio Joe

(21,756 posts)
Fri Feb 8, 2013, 11:10 AM

The inside story of Aaron Swartz’s campaign to liberate court filings

Years before the JSTOR scraping project that led to Aaron Swartz's indictment on federal hacking charges—and perhaps to his suicide—the open-data activist scraped documents from PACER, the federal judiciary's paywalled website for public access to court records. (The acronym PACER stands for Public Access to Court Electronic Records, which may sound like it's straight out of 1988 because it is.) Swartz got 2.7 million documents before the courts detected his downloads and blocked access. The case was referred to the FBI, which investigated Swartz's actions but declined to prosecute him.

A key figure in Swartz's PACER effort was Steve Schultze, now a researcher at Princeton's Center for Information Technology Policy. Schultze recruited Swartz to the PACER fight and wrote the Perl script Swartz modified and then used to scrape the site.

Until recently, Schultze has been quiet about his role in Swartz's PACER scraping caper. But Swartz's death inspired Schultze to speak out. In a recent phone interview, Schultze described how Swartz downloaded gigabytes of PACER data and how that data has been put to use throughout the last four years. Schultze told us he hopes the outrage over Swartz's death will provide momentum for legislation to finish the job Swartz and Schultze started almost five years ago: tearing down PACER's paywall.

http://arstechnica.com/tech-policy/2013/02/the-inside-story-of-aaron-swartzs-campaign-to-liberate-court-filings/


jberryhill

(62,444 posts)
1. I use RECAP every day
Fri Feb 8, 2013, 11:26 AM

The legal profession owes a great debt to this project, and it's worth providing some further detail on how this works.

The federal court PACER system charges eight cents per page for downloading .pdf documents from the online docket. With the RECAP plug-in installed (RECAP is PACER backwards), if you are the first one to download a court document, RECAP captures the download and immediately uploads a copy of it to archive.org. The plug-in also looks at the HTML stream from the PACER docket output, detects whether any item on that docket has been previously captured, and, where one has, provides an alternate download link to retrieve the document from archive.org instead of directly from the PACER system.
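For anyone curious what that docket-rewriting step might look like, here is a rough sketch in Python. It is not RECAP's actual code. The `/doc1/...` link shape, the `archive.example` URLs, and the names `ARCHIVED`, `record_capture` and `add_alternate_links` are all made up for illustration, and a real plug-in would ask a server what has been captured rather than consult a local dictionary.

```python
import re

# Hypothetical lookup of docket document paths that some earlier user has
# already captured, mapped to made-up mirror URLs.
ARCHIVED = {
    "/doc1/12345": "https://archive.example/recap/12345.pdf",
}

# Assumed shape of a document link in the docket HTML (illustrative only).
LINK_RE = re.compile(r'<a href="(/doc1/\d+)">([^<]*)</a>')


def record_capture(path, mirror_url, archived):
    """Remember that a paid download was captured and mirrored."""
    archived[path] = mirror_url


def add_alternate_links(docket_html, archived):
    """Append a free-copy link after every docket link that is already mirrored."""
    def annotate(match):
        original = match.group(0)
        mirror = archived.get(match.group(1))
        if mirror is None:
            # Nobody has captured this one yet; leave the paid link alone.
            return original
        return f'{original} <a href="{mirror}">[free copy]</a>'

    return LINK_RE.sub(annotate, docket_html)


docket = '<a href="/doc1/12345">1</a> <a href="/doc1/67890">2</a>'
print(add_alternate_links(docket, ARCHIVED))
```

Only the first link picks up a free-copy link, since the second document has not been captured by anyone yet. Once a user pays for it and `record_capture` is called, the next person to view the docket would see the alternate link for that item as well.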

There have been a lot of threads at DU over the years in which a news report of a lawsuit was the topic of discussion, and I've been able to download the relevant documents from PACER, and then provide the direct archive.org link for posting in the DU thread. That kind of access to court documents was unthinkable just a few years ago.

When looking at a PACER docket for the first time, it's always interesting to see whether another RECAP user has been there first, as indicated by the RECAP download links on the docket view. You can't, of course, tell who it was, but it's like seeing footprints in the snow.

The only problem I've noted thus far, and I don't know if they've addressed it, is that there are sometimes papers filed in a case which violate certain rules of the federal courts, such as not redacting social security numbers of persons identified in the documents. Normally, the court clerks will catch that after a while, remove the document from the PACER docket, and order the filing party to upload a document with appropriate redactions. In that time window, the original document without redactions has already been RECAPed and remains in circulation. A classic violator of that rule is none other than Orly Taitz, who has been admonished by at least two courts so far to stop submitting non-redacted documents, but she continues to do it on purpose because she knows that third party personal information she includes in her court filings will be downloaded and disseminated before the clerks or the judge catch up to her.


http://www.courthousenews.com/2011/07/28/38543.htm

"Plaintiff should be aware, however, that repeated violations of the rules are in fact sanctionable, even sua sponte," he wrote. "Moreover, wasting the court's time with nonsense is not the way for plaintiff to have any hope of prevailing in this case."

When confronted with her repeated errors, Lamberth said Taitz made a "somewhat hysterical claim for reconsideration" and said a court employee was "intentionally sabotaging" her.

On Monday, Taitz sent another version of her opposition to the court with the last four digits of the Social Security number redacted yet again.

"Plaintiff is either toying with the court or displaying her own stupidity," Lamberth wrote. "She made the correct redactions when she re-filed her complaint and amended complaint. There is no logical explanation she can provide as to why she is now wasting the court's time, as well as staff's time, with these improper redactions."


The "logical explanation" that Judge Lamberth is missing is that she knows full well that her improperly-redacted papers will be flying around the internets long before she gets called on violating the relevant rule.
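On the redaction problem itself, a mirroring tool could in principle run a check on extracted text before uploading. The sketch below is only that, a sketch under assumptions: I don't know whether RECAP does anything like it, the function name `find_unredacted_ssns` is hypothetical, and it assumes the document text has already been pulled out of the PDF. It flags strings that look like a full Social Security number, or one where the first five digits are visible and only the last four are masked, which is the reverse of what the federal rules allow.

```python
import re

# A full nine-digit SSN in the usual 3-2-4 grouping.
FULL_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# First five digits visible with the last four masked, e.g. 123-45-XXXX.
FIRST_FIVE_VISIBLE = re.compile(r"\b\d{3}-\d{2}-[Xx*]{4}(?![\w*])")


def find_unredacted_ssns(text):
    """Return substrings that look like improperly redacted SSNs."""
    hits = [m.group(0) for m in FULL_SSN.finditer(text)]
    hits += [m.group(0) for m in FIRST_FIVE_VISIBLE.finditer(text)]
    return hits


sample = "SSN 123-45-6789, also 123-45-XXXX, but XXX-XX-6789 is fine."
print(find_unredacted_ssns(sample))
```

A pattern match like this would miss scanned images and unusual formatting, and it would produce false positives on other numbers in the same grouping, so at best it could hold a document for a human look rather than decide anything on its own.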
