1
votes

OK, I don't mean for this to be another "how do I get historical stock data" question.

I've already found plenty of readily available sources for that, even neatly exported into direct CSV downloads from the likes of Google Finance and Yahoo and Finviz.

Software I write actively uses handy tools including YahooFinance and Dirk Eddelbuettel's fine Beancounter.

My question is, perhaps especially if anyone's familiar with beancounter, when I query for a single stock's end-of-day prices, how can I make sure all that historical data gets cached (for instance into local text or sqlite files) so as to minimize hitting the web so much and speed up repeat queries overlapping time frames?

Causing a network lookup for today's real-time price if available yet is inevitable, which by itself is fine. My pricehist.sh script combines that with OHLCV prices of a stock N days back, which can also be easily grabbed with eg. wget or curl to the right URL + proper GET variables.

$ ./pricehist.sh VQT 6
VQT  2011-12-14  125.50  125.50  124.43  124.49  20360 
VQT  2011-12-13  128.00  128.00  125.28  125.39  24400 
VQT  2011-12-12  127.50  127.50  126.36  126.66  9100  
VQT  2011-12-09  128.00  128.31  127.82  128.14  12100  

In the instance above I would like pricehist.sh to:

  • look up current market data for today if any, so closing price is actually last price if session not over. DONE
  • Check if rest of requested history for that ticker symbol isn't already stored locally, otherwise:
  • Look up VQT history going back 6 days, store it and print completed table. DONE

Despite all my searching I'm still unsure how to get beancounter to simply output one stock's OHLCV table regardless of portfolio...NOT this output:

$ setup_beancounter -l beancounter.stockdata.sqlite
$ beancounter addstock  VQT ACPW
$ beancounter backpopulate --prevdate '1 month ago' --date 'today'  VQT ACPW 
   Adding VQT from 20111114 to 20111214
   Adding ACPW from 20111114 to 20111214

$ beancounter dayendreport --prevdate '1 month ago' --date 'today'  VQT
## ..... hoping for OHLCV table of VQT only but get this instead:
===============================================================================
Profit / loss         from 12 Dec 2011  to 14 Dec 2011  abs, rel change
-------------------------------------------------------------------------------
Citigroup, Inc. N  USD    2690.00    26.90    2605.00    26.05   -85.00  -3.16%
Exxon Mobil Corpo  USD    6039.75    80.53    5958.00    79.44   -81.75  -1.35%
Google Inc.        USD   15640.75   625.63   15451.75   618.07  -189.00  -1.21%
International Bus  USD    9557.50   191.15    9436.00   188.72  -121.50  -1.27%
-------------------------------------------------------------------------------
Grand Total        USD   33928.00            33450.75           -477.25  -1.41%
===============================================================================

Using the Perl Finance module instead, I just don't see where to activate caching, nor what local data files would be used for the storage. Every call to yahoofinance.rb shows me its web traffic in the squid proxy logs. Thousands of EOD queries, often overlapping, over several processing stages of my engine cause these web lookups to be a major bottleneck.

I figure this wheel's got to already be invented, and I'll use a full RDBMS backend if I have to, though that would mean taxing my Ruby, Perl and Bash scripts with MySQL clients.

By comparison, what I built for the Polish stock market upkeeping only text files for each of the roughly 700 symbols and processed only by grep, sed and awk is incredibly fast--almost instantaneous despite thousands of cycles of looping the equivalent scripts--so my software's daily premarket crunching for the GPW exchange is done in minutes not hours.

1

1 Answers

0
votes

I'm on to something.

The SQLite file Beancounter uses includes a table with these OHLCV daily records:

Table: stockprices
#   |symbol|date    |previous_close|day_open        |day_low         |day_high        |day_close|day_change|bid   |ask   |volume    
----+------+--------+--------------+----------------+----------------+----------------+---------+----------+------+------+----------
1   |C     |20111214|26.9          |26.45           |25.92           |27.19           |26.05    |-0.85     |      |      |68284528  
2   |IBM   |20111214|191.15        |189.84          |188.0           |190.28          |188.72   |-2.43     |      |      |5031717   
3   |GOOG  |20111214|625.63        |621.49          |612.49          |624.32          |618.07   |-7.56     |617.59|618.95|3892889   
31  |HD    |20111214|39.51         |39.45           |38.84           |39.57           |39.14    |-0.37     |      |      |12021824  
32  |IBM   |20111213|              |193.46          |190.64          |194.3           |191.15   |          |      |      |5008400   
33  |IBM   |20111212|              |193.64          |191.22          |193.9           |192.18   |          |      |      |3796100   
34  |IBM   |20111209|              |192.91          |192.4           |194.87          |194.56   |          |      |      |4847900   

So next, I'll rewrite my process to use beancounter to add & update stocks as needed, and since its EOD data reporting is lacking or not working for me, I'm learn the appropriate sqlite commands to pull it directly from the db, massage the output as needed with sed/awk, and post back here.

At the end I should have a simple pricehist.sh query that combines today's live prices(if available) + cached historical end-of-day records.