Discussion:
Mac or Linux users...
Steve Carroll
2025-01-24 17:03:52 UTC
What happens when you run this:

awk '{print "\"" $0 "\","}' /usr/share/dict/words
(... or wherever your dictionary lives if you have one)

Why do I ask, you ask?

All will be revealed!
Apd
2025-01-25 12:51:57 UTC
Post by Steve Carroll
awk '{print "\"" $0 "\","}' /usr/share/dict/words
(... or wherever your dictionary lives if you have one)
I expect you want results from a modern version, but it works in Snow
Leopard with that path ("words" is a link pointing to a file named
"web2") to format the list as required:

"A",
"a",
"aa",
"aal",
... etc.

Note the capitalisation which occurs throughout the dictionary as it
contains proper nouns (Christian names, countries, states, etc.). Any
duplication when case-converted doesn't matter (Map.set() creates or
updates). Your dictionary contains similar proper nouns but everything
is lower case. To be consistent, the setup() function should be
modified so I'll post it in the anagramage thread.
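
The "creates or updates" point can be sketched roughly as follows. This is only an illustration, not the setup() function from the anagramage thread: the four-word sample array and the placeholder value `true` are assumptions made for the example.

```javascript
// Hypothetical sample; a real list would come from /usr/share/dict/words.
const dict = ["A", "a", "aa", "aal"];

// Key each entry by its lowercased form. Map.set() adds the key the
// first time it is seen and overwrites the value on later calls, so
// "A" and "a" end up as a single entry.
const index = new Map();
for (const word of dict) {
  index.set(word.toLowerCase(), true); // placeholder value (assumption)
}

console.log(index.size);        // 3
console.log([...index.keys()]); // [ 'a', 'aa', 'aal' ]
```

Because the second set() for "a" simply replaces the first, no duplicate check is needed while building the Map.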


BTW, this is the "readme" file:


# $NetBSD: README,v 1.2 1997/03/26 07:14:32 mikel Exp $
# @(#)README 8.1 (Berkeley) 6/5/93

WEB ---- (introduction provided by ***@riacs) -------------------------

Welcome to web2 (Webster's Second International) all 234,936 words worth.
The 1934 copyright has elapsed, according to the supplier. The
supplemental 'web2a' list contains hyphenated terms as well as assorted
noun and adverbial phrases. The wordlist makes a dandy 'grep' victim.

-- James A. Woods {ihnp4,hplabs}!ames!jaw (or ***@riacs)
Steve Carroll
2025-01-25 17:11:48 UTC
Post by Apd
Note the capitalisation which occurs throughout the dictionary as it
contains proper nouns (Christian names, countries, states, etc.). Any
duplication when case-converted doesn't matter (Map.set() creates or
updates). Your dictionary contains similar proper nouns but everything
is lower case. To be consistent, the setup() function should be
modified so I'll post it in the anagramage thread.
"To allow for dictionaries with capitalisation and be consistent with
what we already have..."

Since we don't know what anyone's dictionary will contain, allowing
for caps is a good idea, but the problem is that they won't be displayed.
Proposal (one that shouldn't slow things down too much; yes, I know, *never*
tamper with Yoda-speed ;) ...

We lowercase everything upon initial Map construction, adding a '+'
symbol (or whatever char *isn't* in the dict) to the word that contains
a cap. Then, upon reconstruction (of *far* fewer words as this will only
affect the output), we look for the 'tagged' words and capitalize them
on the way 'out'. Just a thought...
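
A minimal sketch of that tagging idea, under stated assumptions: the helper names tagWord() and untagWord() are hypothetical, the tag is appended to the end of the word, and '+' is assumed never to occur in the dictionary.

```javascript
// Hypothetical helpers illustrating the '+' tagging idea.
const TAG = "+"; // assumed not to occur in any dictionary word

// Lowercase a word, appending TAG if it contained a capital.
function tagWord(word) {
  const lower = word.toLowerCase();
  return lower === word ? lower : lower + TAG;
}

// On the way out, capitalise the first letter of tagged words.
function untagWord(stored) {
  if (!stored.endsWith(TAG)) return stored;
  const base = stored.slice(0, -TAG.length);
  return base.charAt(0).toUpperCase() + base.slice(1);
}

const stored = ["Zyzomys", "zygote"].map(tagWord);
console.log(stored);                // [ 'zyzomys+', 'zygote' ]
console.log(stored.map(untagWord)); // [ 'Zyzomys', 'zygote' ]
```

A single trailing tag can only restore a leading capital; a word with capitals elsewhere would need more information than one marker carries.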
FromTheRafters
2025-01-25 17:19:27 UTC
Post by Steve Carroll
We lowercase everything upon initial Map construction, adding a '+'
symbol (or whatever char *isn't* in the dict) to the word that contains
a cap. Then, upon reconstruction (of *far* fewer words as this will only
affect the output), we look for the 'tagged' words and capitalize them
on the way 'out'. Just a thought...
I don't suppose that any C+amelC+ase words would ever come into play.
Mike Easter
2025-01-25 19:01:42 UTC
Post by FromTheRafters
I don't suppose that any C+amelC+ase words would ever come into play.
This is addressed to the 'thread', not just FTR.

I just looked up the completely different definitions of 'dictionary'
(not in play here) and 'wordlist' (very much in play), which contains
everything from words to terms to acronyms; for example, 'red blood
cell', erythrocyte and RBC.

A comprehensive 'word list', including a huge range of types of terms, is
very inclusive. I would say that most wordlists aren't that comprehensive,
mainly because 'term' is so vast.

Perhaps we shouldn't include 'terms' which aren't all 'words' in the
conventional sense, in a 'word' list.

I might add that the wp discussion of camel case is WAYYYY more
comprehensive than I expected.
--
Mike Easter
Mike Easter
2025-01-25 19:06:51 UTC
Post by Mike Easter
Post by FromTheRafters
I don't suppose that any C+amelC+ase words would ever come into play.
I might add that the wp discussion of camel case is WAYYYY more
comprehensive than I expected.
Here's a tidbit from its 'chemistry set' :-)
Post by Mike Easter
The first systematic and widespread use of medial capitals for
technical purposes was the notation for chemical formulas invented by
the Swedish chemist Jacob Berzelius in 1813.
ie NaCl

Now, kiddies, does NaCl belong in a really good wordlist? Of course it
does.
--
Mike Easter
FromTheRafters
2025-01-25 19:12:40 UTC
Post by Mike Easter
Post by Mike Easter
Post by FromTheRafters
I don't suppose that any C+amelC+ase words would ever come into play.
I might add that the wp discussion of camel case is WAYYYY more
comprehensive than I expected.
Here's a tidbit from its 'chemistry set' :-)
Post by Mike Easter
The first systematic and widespread use of medial capitals for
technical purposes was the notation for chemical formulas invented by
the Swedish chemist Jacob Berzelius in 1813.
ie NaCl
Now, kiddies, does NaCl belong in a really good wordlist? Of course it does.
You might like to see this too,

https://en.wikipedia.org/wiki/Tall_Man_lettering
Mike Easter
2025-01-25 19:35:23 UTC
Post by FromTheRafters
You might like to see this too,
https://en.wikipedia.org/wiki/Tall_Man_lettering
Yeah; that came up in the wp CC article.
--
Mike Easter
Mike Easter
2025-01-25 19:40:50 UTC
Post by Mike Easter
Post by FromTheRafters
You might like to see this too,
https://en.wikipedia.org/wiki/Tall_Man_lettering
Yeah; that came up in the wp CC article.
I might add that the 'question' of *WHO* makes those errors is fodder
for discussion.

In the old days, a lot of doctor-things were /written/ by the
notoriously poor handwriting doctors. Plenty of room for error there
even w/o the similarity issue. Many doctors write so poorly that the
interpretation might not even *resemble* the intended drug or dose.

Once things started being digital and 'typewritten', then the errors
were rarely the doc or even the pharmacist, who are much more sensitive
to similar names, but dispensers such as nurses. The tallman business
was important for them.
--
Mike Easter
Steve Carroll
2025-01-25 19:54:15 UTC
Post by Mike Easter
Post by Mike Easter
Post by FromTheRafters
I don't suppose that any C+amelC+ase words would ever come into play.
I might add that the wp discussion of camel case is WAYYYY more
comprehensive than I expected.
Here's a tidbit from its 'chemistry set' :-)
Post by Mike Easter
The first systematic and widespread use of medial capitals for
technical purposes was the notation for chemical formulas invented by
the Swedish chemist Jacob Berzelius in 1813.
ie NaCl
Now, kiddies, does NaCl belong in a really good wordlist? Of course it
does.
Then on to the next proposal:

235800: {"zygophyceae" => Map(9)}
key: "zygophyceae"
value: Map(9)
0: {"Z" => 1}
1: {"y" => 2}
2: {"g" => 1}
3: {"o" => 1}
4: {"p" => 1}
5: {"h" => 1}
6: {"c" => 1}
7: {"e" => 2}
8: {"a" => 1}
9: {"t" => "04"}
size: 9

... where 04 means positions 0 and 4 get 'tagged'.

The word "ZygoPhyceae" is for demonstration purposes only. In my words.js, a
flat (un-nested) JS array...

dict = [
"A",
"a",
... // snip
"Zyzomys",
"Zyzzogeton"
]


... it appears as "Zygophyceae".


To clarify: The 'Map' data structure above gets created from the 'dict'
array. I'm proposing to add 'element 9', where 't' reflects the letters
that get 'tagged' for capitalization. Then, 't' is utilized during
reconstruction to replace the caps that were removed when the dict array
was turned into a 'Map' object.
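
The position-tagging proposal could look something like the sketch below. The helper names stripCaps() and restoreCaps() are hypothetical and are not taken from words.js; the positions are kept as an array of indices rather than the "04" string shown above, and plain ASCII words are assumed.

```javascript
// Hypothetical helpers; the names are illustrative, not from words.js.

// Return the lowercased word plus the indices of its capital letters.
function stripCaps(word) {
  const caps = [];
  for (let i = 0; i < word.length; i++) {
    const ch = word[i];
    if (ch !== ch.toLowerCase()) caps.push(i);
  }
  return { key: word.toLowerCase(), caps };
}

// Put the capitals back using the recorded indices.
function restoreCaps(key, caps) {
  const chars = key.split("");
  for (const i of caps) chars[i] = chars[i].toUpperCase();
  return chars.join("");
}

const { key, caps } = stripCaps("ZygoPhyceae");
console.log(key, caps);              // zygophyceae [ 0, 4 ]
console.log(restoreCaps(key, caps)); // ZygoPhyceae
```

An array is used here because a digit string becomes ambiguous once a position reaches 10 or more: "110" could mean 1 and 10, or 11 and 0.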
Apd
2025-01-25 20:12:54 UTC
Post by Mike Easter
Post by FromTheRafters
I don't suppose that any C+amelC+ase words would ever come into play.
Something to consider. Hyphenated words are already allowed.
Post by Mike Easter
Now, kiddies, does NaCl belong in a really good wordlist? Of course it
does.
This exercise is to duplicate and perhaps improve the online jumble
helper you showed. It excludes proper nouns and chemical symbols. My
Mac dictionary does have some proper nouns and chemical names but not
their symbols.

The point of this thread is for Mac/*nix users to check SC's command
works for them so instructions could be given on how to create their
own dictionaries. Perhaps they might also grep their dictionaries to
see if their favourite words are included.
FromTheRafters
2025-01-25 22:47:56 UTC
Post by Apd
Post by FromTheRafters
I don't suppose that any C+amelC+ase words would ever come into play.
Something to consider. Hyphenated words are already allowed.
TomTom comes to mind, I hardly ever see it not camelcased.

[...]
Apd
2025-01-25 23:44:58 UTC
Post by FromTheRafters
Post by Apd
Post by FromTheRafters
I don't suppose that any C+amelC+ase words would ever come into play.
Something to consider. Hyphenated words are already allowed.
TomTom comes to mind, I hardly ever see it not camelcased.
tom-tom - A small joined pair of drums, beaten with the hands.
tomtom - Alternative form of tom-tom.
TomTom - A particular brand of GPS system.

A small tweak and I can display any-case words in a dictionary. It
will treat the hyphenated version as a different word, but currently
only one case variant of a word can be stored.
Apd
2025-01-26 00:11:53 UTC
Post by Apd
A small tweak and I can display any-case words in a dictionary. It
will consider the hyphenated version different but there currently
can be only one of different cases.
It's better than I thought - all case variations are being stored.
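
The code behind this is not shown in the thread, so the following is only a guess at one way every case variation could be kept: each lowercased key maps to a list of the original spellings. The sample entries and the name `variants` are assumptions.

```javascript
// Hypothetical sample entries.
const entries = ["tom-tom", "tomtom", "TomTom"];

// Map each lowercased key to every original spelling that produced it.
const variants = new Map();
for (const word of entries) {
  const lower = word.toLowerCase();
  if (!variants.has(lower)) variants.set(lower, []);
  variants.get(lower).push(word);
}

console.log(variants.get("tomtom"));  // [ 'tomtom', 'TomTom' ]
console.log(variants.get("tom-tom")); // [ 'tom-tom' ]
```

The hyphenated form gets its own key because lowercasing leaves the hyphen in place.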
Steve Carroll
2025-01-26 15:41:07 UTC
Post by Apd
Post by FromTheRafters
Post by Apd
Post by FromTheRafters
I don't suppose that any C+amelC+ase words would ever come into play.
Something to consider. Hyphenated words are already allowed.
TomTom comes to mind, I hardly ever see it not camelcased.
tom-tom - A small joined pair of drums, beaten with the hands.
tomtom - Alternative form of tom-tom.
TomTom - A particular brand of GPS system.
OLAS: Tangentially speaking of tom toms:

http://youtu.be/OKkiIEwQ1D4

Much the same here!
Apd
2025-01-26 17:25:44 UTC
Post by Steve Carroll
Post by Apd
Post by FromTheRafters
TomTom comes to mind, I hardly ever see it not camelcased.
tom-tom - A small joined pair of drums, beaten with the hands.
tomtom - Alternative form of tom-tom.
TomTom - A particular brand of GPS system.
http://youtu.be/OKkiIEwQ1D4
Much the same here!
No bands in the charts. How strange, I never realized it and wonder
why? Richard Osman is fairly high profile over here. He hosts a quiz
and appears in various other TV shows. He's very quick witted.
David
2025-01-25 23:51:34 UTC
Post by Apd
Post by FromTheRafters
I don't suppose that any C+amelC+ase words would ever come into play.
Something to consider. Hyphenated words are already allowed.
TomTom comes to mind, I hardly ever see it not *camelcased*.
[...]
I learn something new here almost every day!

https://en.wikipedia.org/wiki/Camel_case
--
David
Steve Carroll
2025-01-26 15:42:25 UTC
Post by David
Post by Apd
Post by FromTheRafters
I don't suppose that any C+amelC+ase words would ever come into play.
Something to consider. Hyphenated words are already allowed.
TomTom comes to mind, I hardly ever see it not *camelcased*.
[...]
I learn something new here almost every day!
It won't kill you!
Post by David
https://en.wikipedia.org/wiki/Camel_case
Look up: Pascal case (or 'upper camel case', if you can find it).
FromTheRafters
2025-01-26 19:55:06 UTC
Post by Steve Carroll
Post by David
Post by Apd
Post by FromTheRafters
I don't suppose that any C+amelC+ase words would ever come into play.
Something to consider. Hyphenated words are already allowed.
TomTom comes to mind, I hardly ever see it not *camelcased*.
[...]
I learn something new here almost every day!
It won't kill you!
Post by David
https://en.wikipedia.org/wiki/Camel_case
Look up: Pascal case (or 'upper camel case', if you can find it).
AI Overview:

Upper camel case is a naming convention that capitalizes the first
letter of each word in a phrase, including the first word. It's a
variation of the camel case naming convention, which capitalizes the
first letter of each subsequent word but may or may not capitalize the
first letter of the phrase.

Explanation

What it's used for: Upper camel case is often used for class names.

How it's used: For example, "MyXmlParser" and "MyXMLParser" are both
valid upper camel case forms.

Why it's used: Camel case is often used in programming and computer
naming conventions because it's easier to read and maintain code when
names are consistent.

Related terms

Lower camel case: Also known as dromedary case, this is when the first
letter of the phrase is lowercase.

Pascal case: This is when the first letter of each word in the phrase
is capitalized, including the first letter of the first word.

Medial capitals: This is the original name for the CamelCase naming
convention.

Other naming conventions include snake case and kebab case.
Apd
2025-01-25 20:09:44 UTC
Post by Steve Carroll
"To allow for dictionaries with capitalisation and be consistent with
what we already have..."
Being that we don't know what anyone's dictionary will contain, allowing
for caps is a good idea but the problem is they won't be displayed.
Proposal (that shouldn't slow things down too much (yes, I know, *never*
tamper with Yoda-speed ;) ...
We lowercase everything upon initial Map construction, adding a '+'
symbol (or whatever char *isn't* in the dict) to the word that contains
a cap. Then, upon reconstruction (of *far* fewer words as this will only
affect the output), we look for the 'tagged' words and capitalize them
on the way 'out'. Just a thought...
At the moment I want to leave the dict array in its original form but
lowercase everything else. I'm not keen on reconstruction after the
fact.
Steve Carroll
2025-01-25 20:28:40 UTC
Post by Apd
Post by Steve Carroll
"To allow for dictionaries with capitalisation and be consistent with
what we already have..."
Being that we don't know what anyone's dictionary will contain, allowing
for caps is a good idea but the problem is they won't be displayed.
Proposal (that shouldn't slow things down too much (yes, I know, *never*
tamper with Yoda-speed ;) ...
We lowercase everything upon initial Map construction, adding a '+'
symbol (or whatever char *isn't* in the dict) to the word that contains
a cap. Then, upon reconstruction (of *far* fewer words as this will only
affect the output), we look for the 'tagged' words and capitalize them
on the way 'out'. Just a thought...
At the moment I want to leave the dict array in its original form but
lowercase everything else. I'm not keen on reconstruction after the
fact.
I know, I'm just tossing out ideas... it's fine (and fast) the way it
is.
Gremlin
2025-01-28 06:19:09 UTC
Post by Steve Carroll
awk '{print "\"" $0 "\","}' /usr/share/dict/words
(... or wherever your dictionary lives if you have one)
:) I got a screen full of this and I can't scroll back far enough to copy
and paste from where I entered the command in the terminal. So here's a
snippet of the bottom:

"zones",
"zoning",
"zonked",
"zoo",
"zoological",
"zoologist",
"zoologist's",
"zoologists",
"zoology",
"zoology's",
"zoom",
"zoomed",
"zooming",
"zoom's",
"zooms",
"zoo's",
"zoos",
"zorch",
"zucchini",
"zucchini's",
"zucchinis",
"zwieback",
"zwieback's",
"zygote",
"zygote's",
"zygotes",
***@Gremlin:/

I'm on the latest version of MX Linux as of this post date. Which, btw, has
some issues. After applying 208 updates, my Thunar interface is a little
buggy now (crashes out, no warning), along with my dock module. If I'm
running VLC and I just double-click another file while one is playing,
there's a very good chance that applet will crash. Grrr.. I'm hoping this
gets resolved shortly. I think I'll be loading another laptop I acquired
fresh from the latest ISO, and hopefully it won't become.. slightly
unstable as this one has after the latest updates were applied. I
love this distro, but on occasion, some updates do break things. But
they're usually very good about fixing it.
Post by Steve Carroll
Why do I ask, you ask?
All will be revealed!
Looking forward to it.
--
I don't need no Dr. All I need...is my lawyer.
pothead
2025-01-28 23:15:36 UTC
Post by Gremlin
I'm on the latest version of MX Linux as of this post date. Which, btw, has
some issues. After applying 208 updates, my Thunar interface is a little
buggy now (crashes out, no warning), along with my dock module. If I'm
running VLC and I just double-click another file while one is playing,
there's a very good chance that applet will crash. Grrr.. I'm hoping this
gets resolved shortly. I think I'll be loading another laptop I acquired
fresh from the latest ISO, and hopefully it won't become.. slightly
unstable as this one has after the latest updates were applied. I
love this distro, but on occasion, some updates do break things. But
they're usually very good about fixing it.
I too had issues with the latest update to MXLinux.
Was getting file system errors on boot.
Since the system would not boot to a login screen, I had to run fsck on /dev/sda1, which fixed the issue.
--
pothead

Why did Joe Biden pardon his family?
Read below to learn the reason.
The Biden Crime Family Timeline here:
https://oversight.house.gov/the-bidens-influence-peddling-timeline/
Gremlin
2025-02-17 01:25:46 UTC
[snip]
Post by pothead
I too had issues with the latest update to MXLinux.
This usually doesn't happen. Their QC is usually top notch. But, I
understand there's a pile of hardware it's used on that might/might not be
okay with things on occasion. It's software, and, I certainly understand
those issues. :)
Post by pothead
Was getting file system errors on boot.
I was lucky in that respect. No issues at the file system level. But damn if
the DE wasn't slightly unstable. The updates it applied today seem to have
resolved that for me, though. So far. :) Yea, I do the updates when I am
using the laptop if any are available, so you can see by the posting time/date
it's been a while.
Post by pothead
Since the system would not boot to a login screen I had to run fsck on
/dev/sda1 which fixed the issue.
Grrrr. Always something, isn't it? :)
--
I don't need no Dr. All I need...is my lawyer.