question about GNU 'sort'

 
Syrinx Donating Member (1000+ posts) Sun Aug-03-08 03:35 AM
Original message
question about GNU 'sort'
Edited on Sun Aug-03-08 03:40 AM by Syrinx
I'm trying to sort some Usenet headers with GNU 'sort,' and
I'm having a big problem.

I want 'sort' to consider all the characters in each record,
but the program is acting as if I had invoked it with the
'-d' option (consider only blanks and alphanumeric
characters), which I haven't.

For example, if I feed 'sort' the following lines
(between the rows of dashes):

-------------------
(3
(1
{2
[1
[3
(2
{1
{3
[2
-------------------

I get back the following:

-------------------
(1
[1
{1
(2
[2
{2
(3
[3
{3
-------------------

Instead, I need to get this:

-------------------
(1
(2
(3
[1
[2
[3
{1
{2
{3
-------------------

I've looked over the man page, and didn't see anything to
correct this behavior, but I'm undoubtedly overlooking
something.

Can someone help, please?

And while I'm at it, does anyone know of a program like GNU
sort that offers some sort of progress indication, so that I
could send a progress report back to my main program through a
pipe or something?

Thanks!


Syrinx Donating Member (1000+ posts) Mon Aug-04-08 12:37 AM
Response to Original message
1. turns out it was a LOCALE problem
export LC_ALL=C

And now sorting works as expected.
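
For anyone hitting the same thing: in many non-C locales, GNU sort compares lines using the locale's collation rules, which can give punctuation such as '(', '[' and '{' little or no weight. With LC_ALL=C it compares raw byte values instead. Below is a minimal Python sketch of the same fix, setting the variable only for the child process rather than exporting it shell-wide. The helper name sort_c_locale is made up for illustration, and the sketch assumes a 'sort' binary is on the PATH.

```python
import os
import subprocess


def sort_c_locale(lines):
    """Sort text lines with the external `sort`, forcing the C locale."""
    # Copy the current environment and override LC_ALL for the child only.
    env = dict(os.environ, LC_ALL="C")
    data = "".join(line + "\n" for line in lines).encode("ascii")
    result = subprocess.run(
        ["sort"], input=data, stdout=subprocess.PIPE, env=env, check=True
    )
    return result.stdout.decode("ascii").splitlines()


# The sample lines from the original message.
sample = ["(3", "(1", "{2", "[1", "[3", "(2", "{1", "{3", "[2"]
print(sort_c_locale(sample))
```

In the C locale the order follows the ASCII codes, so every '(' line sorts before every '[' line, and every '[' line before every '{' line.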
 
RoyGBiv Donating Member (1000+ posts) Mon Aug-04-08 12:43 AM
Response to Reply #1
2. I'm glad you figured that out ...

'Cause I got obsessed with it, and it was driving me nuts.

I spent some time this afternoon trying to work it out. Before I knew it, I had put together a script that was close to doing it, and then I realized it was so complicated that it defeated the purpose of even having sort.

 
Syrinx Donating Member (1000+ posts) Mon Aug-04-08 05:53 AM
Response to Reply #2
3. sorry for causing an obsession!
:)

In my quest for progress indication, I've decided that I can spawn the sort process from Python with the "subprocess" module, in a separate thread, and then go into the 'tmp' directory and measure the sizes of the temporary files the sort program creates as it runs.
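
A rough sketch of that idea, under some assumptions: it gives 'sort' a private temporary directory with the -T option and polls that directory's total size while the process runs. The names dir_size, sort_with_progress and report are invented for illustration. The temp-file size is only a crude proxy for progress, and 'sort' writes temporary files only when the input doesn't fit in its memory buffer, so small inputs will report zero throughout.

```python
import os
import subprocess
import tempfile
import time


def dir_size(path):
    """Return the total size in bytes of the files directly inside path."""
    total = 0
    for name in os.listdir(path):
        try:
            total += os.path.getsize(os.path.join(path, name))
        except OSError:
            # sort may delete a temp file between listdir() and getsize().
            pass
    return total


def sort_with_progress(src, dst, report, poll=0.2):
    """Run `sort src -o dst`, calling report(bytes_in_temp_dir) while it runs."""
    env = dict(os.environ, LC_ALL="C")
    with tempfile.TemporaryDirectory() as tmp:
        # -T points sort at our private directory for its temporary files.
        proc = subprocess.Popen(["sort", "-T", tmp, "-o", dst, src], env=env)
        while proc.poll() is None:
            report(dir_size(tmp))
            time.sleep(poll)
        return proc.returncode
```

Here report can be any callable, for example one that writes the number to a pipe read by the main program. Polling from the calling thread like this avoids needing a separate thread just to watch the directory.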

But my data file is delimited by null characters, and python is complaining about that.

"argument 1 must be (encoded string without NULL bytes), not str"

I'm not sure I'm communicating effectively. :)

How do you send a string to subprocess.Popen including a null character?

Thanks in advance if you can answer this. :hi:
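
One likely cause, offered as a guess since the failing call isn't shown: the NUL character is being passed inside a command-line argument. Arguments are handed to the new process as NUL-terminated C strings, so an argument can't contain a NUL byte, and Python refuses to try. Data containing NULs can still go through the child's standard input, and GNU sort has a -z (--zero-terminated) option for records that end in NUL instead of a newline. A sketch under those assumptions follows. The helper name sort_nul_records is hypothetical, and it uses subprocess.run, which needs Python 3.5 or later.

```python
import os
import subprocess


def sort_nul_records(records):
    """Sort byte-string records by piping them to `sort -z` on stdin."""
    # Terminate each record with a NUL byte; the NULs travel as data on
    # stdin, never as part of a command-line argument.
    payload = b"".join(record + b"\0" for record in records)
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(
        ["sort", "-z"], input=payload, stdout=subprocess.PIPE, env=env, check=True
    )
    out = result.stdout
    if out.endswith(b"\0"):
        out = out[:-1]  # drop the final terminator before splitting
    return out.split(b"\0") if out else []
```

If NUL is the field separator rather than the record terminator, the GNU sort manual documents passing the two-character string \0 to -t, which is a backslash followed by a zero and not an actual NUL byte, so it is safe to put in an argument list.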
 
RoyGBiv Donating Member (1000+ posts) Mon Aug-04-08 07:32 PM
Response to Reply #3
4. Sorry ...

I don't do Python.

I just dabble but am not an actual programmer.

 
Dogmudgeon Donating Member (1000+ posts) Fri Oct-31-08 08:50 AM
Response to Reply #3
5. I've never run into that
Admittedly, I'm a relative n00b when it comes to Python, but I've done a fair bit of corpus (linguistic) analysis with Python and have never had a problem with null-terminated text files, no matter what end-of-line characters were used. So, for me, it handles Chr = 0, 10, and 13.

Perhaps it's another config issue like the localization problem ... ?

--p!
 