2
votes

I've been trying to find an idiomatic way to convert user data to valid keywords in Clojure.

A possible use case: when reading an Excel spreadsheet, I would like to dynamically build a map for each row after the first, where the first row contains the headers that become the keyword keys. I need to account for headers that contain spaces or other invalid characters. I have read that the keyword function will not complain and will give you an invalid keyword that may be hard to work with or even harmful.

I could do the conversions manually or possibly use a library like slugger, but I wanted to know whether anything built in could handle this.

Also, I have read that at one point creating too many keys could overload the heap, but that was from 2010 and it may have been resolved in 1.3. Would it be best just to create my hash-map with string keys instead of keywords? I have read that doing so is not idiomatic.

Is there good reason not to use the strings with offending characters themselves as keys rather than converting to keywords? Seems like you could avoid the whole issue and keep the keys the same as the column names. - A. Webb
As overthink stated below, it doesn't complain about spaces, so (keyword "hi there") gets converted to :hi there, which is not easy to work with: you can't write it literally and always have to use (keyword "hi there"), because :hi there throws a runtime exception. - user1371281
Serializing and reloading strings is a solved problem, and making the keys keywords is a pretty small bonus compared with the large chore of making readable keywords for each string. Unless you don't need to serialize! If you aren't going to store and reload the keywords, then just make keywords out of the strings. (keyword "this is a string") is totally valid, it just isn't readable in its default printed form. - noisesmith
To be clear: the keyword with a space in it is not invalid. It is fine. It just is not readable. - noisesmith
In the case of a space you are correct, but keyword also allows invalid characters that I may encounter, like %. The space was just an example; such a keyword is difficult to work with even if it is valid. I have been leaning towards slugger, which would take "% Increase" and convert it to "percent-increase", which would make a valid keyword, but I wanted to make sure I wasn't reinventing the wheel. - user1371281

2 Answers

1
votes

Unless you have good reason for doing otherwise, simply use the string itself as a key.

user=> (def my-db (atom {}))
#'user/my-db
user=> (swap! my-db assoc "New York" 1)
{"New York" 1}
user=> (swap! my-db assoc "Los Angeles" 2)
{"Los Angeles" 2, "New York" 1}
user=> (do (print "Which city do you want to rank?\n =>") 
           (flush) 
           (@my-db (read-line)))
Which city do you want to rank?
 =>New York
1

If you encode/keywordize the keys of your map, you'll instead have to encode/keywordize the user's input, or stringify/decode your keys, on every interaction with the user.
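For the spreadsheet case in the question, keeping the header strings as keys could look roughly like the sketch below. The `rows` data and the `rows->maps` name are made up for illustration; the rows are assumed to have already been read into a vector of vectors by whatever Excel library is in use.

```clojure
;; Hypothetical sample data: the first row holds the headers,
;; the remaining rows hold the values.
(def rows
  [["City"        "% Increase"]
   ["New York"    1.5]
   ["Los Angeles" 2.0]])

(defn rows->maps
  "Builds one map per data row, keyed by the header strings as-is."
  [[headers & data]]
  (mapv #(zipmap headers %) data))

;; Lookups use the same string the user sees in the spreadsheet.
(get (first (rows->maps rows)) "% Increase")
```

No cleaning step is needed here, because any string is a valid map key.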

0
votes

Hm, it does appear that keyword will spit out unreadable keywords:

user=> (keyword "foo bar")
:foo bar
user=> (keyword "foo:")
:foo:

Neither of these can be read in again.
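One way to check this at the REPL is to print a keyword and read it back. The `round-trips?` helper below is a hypothetical name introduced only for this sketch; it uses the core `pr-str` and `read-string` functions.

```clojure
(defn round-trips?
  "True if printing kw and reading the text back yields the same keyword."
  [kw]
  (try
    (= kw (read-string (pr-str kw)))
    ;; The reader rejects some tokens outright; treat that as a failure.
    (catch Exception _ false)))

(round-trips? :foo)                ; an ordinary keyword survives
(round-trips? (keyword "foo bar")) ; the reader stops at the space
(round-trips? (keyword "foo:"))    ; the reader rejects the trailing colon
```

Note that `read-string` only reads the first form, which is why the keyword with a space silently comes back as a different keyword rather than raising an error.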

I would just write a small function to clean your input (rules here) before passing it to the keyword function.
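A minimal sketch of such a cleaning function follows. The name `header->keyword` and the specific rules (lower-casing, spelling out `%` as "percent", collapsing everything else to hyphens) are assumptions chosen for illustration, not rules taken from the Clojure documentation; adjust them to the headers you actually expect.

```clojure
(require '[clojure.string :as str])

(defn header->keyword
  "Turns an arbitrary header string into a plain, readable keyword.
  Returns nil when nothing usable is left after cleaning."
  [s]
  (let [slug (-> s
                 str/trim
                 str/lower-case
                 (str/replace #"%" " percent ")   ; assumed rule: spell out %
                 (str/replace #"[^a-z0-9]+" "-")  ; anything else becomes a hyphen
                 (str/replace #"^-+|-+$" ""))]    ; drop leading/trailing hyphens
    (when (seq slug)
      (keyword slug))))

(header->keyword "% Increase")     ; => :percent-increase
(header->keyword "Unit Price ($)") ; => :unit-price
```

This sketch does not handle every case: two different headers can collapse to the same keyword, non-ASCII letters are stripped, and headers that start with a digit are passed through unchanged.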