Suo Lu

Thinking in Data


Some Study References

Headline

No updates for two years: the time went to work, studying for a Master's degree, and welcoming a son!

I am going to start writing again, beginning here with some valuable links that are not directly referenced in my thesis.

Cloud Storage and Availability

AWS S3 Down. 2017.2.
http://www.businessinsider.com/aws-outage-hurt-internet-retailers-except-amazon-2017-3

AWS S3 Down. 2017.9.
https://news.ycombinator.com/item?id=15251469

Alibaba Cloud incident. 2018.6.
https://xw.qq.com/tech/20180627029994/TEC2018062702999400

GCS Down. 2018.7.
https://status.cloud.google.com/incident/cloud-networking/18012

Azure Down. 2018.9.
https://www.datacenterdynamics.com/news/microsoft-azure-suffers-outage-after-cooling-issue/

Snapchat down
https://www.recode.net/2017/2/7/14526832/snap-ipo-snapchat-s1-wall-street-business-google-cloud

Google Cloud vs AWS
https://kinsta.com/blog/google-cloud-vs-aws/

Cloud Service Map For AWS and Azure
https://azure.microsoft.com/en-us/blog/cloud-service-map-for-aws-and-azure-available-now/

The Limits of the CAP Theorem
https://www.cockroachlabs.com/blog/limits-of-the-cap-theorem/

Cloud Storage Durability
https://www.backblaze.com/blog/cloud-storage-durability/

Object storage vs Block storage services
https://www.digitalocean.com/community/tutorials/object-storage-vs-block-storage-services

Data Redundancy and Consistency

Raft FAQ
https://pdos.csail.mit.edu/6.824/papers/raft-faq.txt

Understanding the Raft algorithm
https://mp.weixin.qq.com/s/t1f1AmWw6ir7rWTiNLHyBQ

Consistency protocols for distributed systems: 2PC and 3PC
http://matt33.com/2018/07/08/distribute-system-consistency-protocol/

Apache Zookeeper vs Etcd3
https://dzone.com/articles/apache-zookeeper-vs-etcd3

Distributed locks
http://www.54tianzhisheng.cn/2018/04/24/Distributed_lock/

Building distributed locks with the DynamoDB lock client
https://aws.amazon.com/blogs/database/building-distributed-locks-with-the-dynamodb-lock-client/

Programmable consistency choices
https://docs.microsoft.com/en-us/azure/cosmos-db/introduction
http://docs.datastax.com/en/archived/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html

Distributed system consistency protocols
http://matt33.com/2018/07/08/distribute-system-consistency-protocol/
https://stackoverflow.com/questions/12346326/cap-theorem-availability-and-partition-tolerance

Snitch: putting consistency back into S3
https://blog.box.com/blog/snitch-putting-consistency-back-s3/

Erasure Coding
https://www.akalin.com/intro-erasure-codes
http://www.nephostechnologies.com/blog/object-storage-and-erasure-coding-what%E2%80%99s-it-all-about
http://searchstorage.techtarget.com/definition/erasure-coding
http://www.computerweekly.com/feature/Erasure-coding-versus-RAID-as-a-data-protection-method
https://blogs.vmware.com/virtualblocks/2016/02/12/the-use-of-erasure-coding-in-virtual-san-6-2/
http://www.snia.org/sites/default/files/SDC15_presentations/datacenter_infra/Shenoy_The_Pros_and_Cons_of_Erasure_v3-rev.pdf
http://www.snia.org/sites/default/files/Wesley_Leggette-Next_Gen-Erasurev4_EDIT.pdf
http://ece.iisc.ernet.in/~vijay/pdfs/openday2017.pdf
https://blog.cloudera.com/blog/2015/09/introduction-to-hdfs-erasure-coding-in-apache-hadoop/
http://www.taocloudx.com/index.php?a=shows&catid=4&id=104
https://blogs.dropbox.com/tech/2018/06/extending-magic-pocket-innovation-with-the-first-petabyte-scale-smr-drive-deployment/

Time Series Forecasting

Algorithms and methods
https://cs.stackexchange.com/questions/13937/which-machine-learning-algorithms-can-be-used-for-time-series-forecasts
http://www.ulb.ac.be/di/map/gbonte/ftp/time_ser.pdf
https://arxiv.org/abs/1506.00019
https://www.analyticsvidhya.com/blog/2018/02/time-series-forecasting-methods/
https://stats.stackexchange.com/questions/23864/what-common-forecasting-models-can-be-seen-as-special-cases-of-arima-models

Time series classification and clustering
https://www.quora.com/What-are-some-time-series-classification-methods
http://alexminnaar.com/time-series-classification-and-clustering-with-python.html
https://stats.stackexchange.com/questions/66027/time-series-classification-very-poor-results

ML Pros and Cons
https://mp.weixin.qq.com/s?__biz=MzI0ODcxODk5OA==&mid=2247485786&idx=1&sn=a5ba6589a75255a169f7fcb3ae10c48c
https://elitedatascience.com/machine-learning-algorithms
https://elitedatascience.com/dimensionality-reduction-algorithms

Time series prediction reviews
https://datascience.stackexchange.com/questions/12721/time-series-prediction-using-arima-vs-lstm
http://www.datasciencecentral.com/profiles/blogs/time-series-analysis-with-generalized-additive-models
https://blog.statsbot.co/time-series-anomaly-detection-algorithms-1cef5519aef2
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3641111/

Time series database
https://blog.timescale.com/what-the-heck-is-time-series-data-and-why-do-i-need-a-time-series-database-dcf3b1b18563
https://blog.timescale.com/timescaledb-vs-6a696248104e

Experience sharing
http://www.infoq.com/cn/articles/facebook-open-source-mass-forecasting-system-prophet
http://blog.zhanglun.me/2017/06/13/%E5%A4%A7%E8%88%AA%E6%9D%AF%E2%80%9C%E6%99%BA%E9%80%A0%E6%89%AC%E4%B8%AD%E2%80%9D%E7%94%B5%E5%8A%9BAI%E5%A4%A7%E8%B5%9B%E5%8F%82%E8%B5%9B%E7%BB%8F%E9%AA%8C/
http://www.infoq.com/cn/articles/application-of-spark–in-jingdong-supply-chain-forecasting

19 Nov 2018