Add go notes
@@ -2,6 +2,15 @@
|
||||
title: "Golang"
|
||||
date: 2023-05-08T09:54:48-04:00
|
||||
draft: false
|
||||
references:
|
||||
- title: "Why David Yach Loves Go"
|
||||
url: https://cloud.google.com/blog/products/application-modernization/why-david-yach-loves-go
|
||||
date: 2023-06-13T14:51:05Z
|
||||
file: cloud-google-com-windxx.txt
|
||||
- title: "One process programming notes (with Go and SQLite)"
|
||||
url: https://crawshaw.io/blog/one-process-programming-notes
|
||||
date: 2023-06-13T14:49:51Z
|
||||
file: crawshaw-io-k5slfj.txt
|
||||
---
|
||||
|
||||
I find [Go][1] really compelling, even though it's not super applicable to my job. When evaluating a new tool, I find I'm weirdly biased to things written in Go.
|
||||
@@ -51,7 +60,9 @@ I find [Go][1] really compelling, even though it's not super applicable to my jo
|
||||
* [Standard Go Project Layout][8]
|
||||
* [The files & folders of Go projects][9]
|
||||
* [Why David Yach Loves Go][10]
|
||||
* [One process programming notes (with Go and SQLite)][11]
|
||||
|
||||
[8]: https://github.com/golang-standards/project-layout
|
||||
[9]: https://changelog.com/gotime/278
|
||||
[10]: https://cloud.google.com/blog/products/application-modernization/why-david-yach-loves-go
|
||||
[11]: https://crawshaw.io/blog/one-process-programming-notes
|
||||
|
||||
365
static/archive/cloud-google-com-windxx.txt
Normal file
@@ -0,0 +1,365 @@
|
||||
Why I love Go
|
||||
September 12, 2022
|
||||
|
||||
David Yach
|
||||
|
||||
Director of Engineering at Google Cloud
|
||||
|
||||
I've been building software over the last four decades, as a developer,
|
||||
manager and executive in both small and large software companies. I
|
||||
started my career working on commercial compilers, first BASIC and then
|
||||
C. I have written a lot of code in many different languages, and
|
||||
managed teams with even broader language usage.
|
||||
|
||||
I learned Go about 5 years ago when I was CTO at a startup/scaleup. At
|
||||
the time, we were looking to move to a microservice architecture, and
|
||||
that shift gave us the opportunity to consider moving away from the
|
||||
incumbent language (Scala). As I read through the Go tutorials, my
|
||||
compiler-writing background came back to me and I found myself
|
||||
repeatedly thinking "That's cool - I know why the Go team did that!" So
|
||||
I got hooked on the language design.
|
||||
|
||||
Learning
|
||||
|
||||
I have worked with many different computer languages over the years, so
|
||||
I was not surprised I could quickly get started writing Go programs
|
||||
after reading through the online documents and tutorials. But then when
|
||||
I saw a new co-op student (a.k.a. intern) learn Go and write a
|
||||
substantial prototype in their first two weeks on the job, it became
|
||||
clear that Go was much easier to learn than many other languages.
|
||||
|
||||
Writing code
|
||||
|
||||
As I started writing my first Go programs, the first thing that struck
|
||||
me was the blazing compiler speed. It was as fast or faster starting my
|
||||
application than many interpreted languages, yet it was a compiled
|
||||
program with a strongly typed language. (I have an affinity for
|
||||
strongly typed languages; I have spent way too much time tracking down
|
||||
obscure issues in my own code in dynamically typed languages, where the
|
||||
same issue would have been a compile error in a strongly typed
|
||||
language.) Even better, in Go I often don't need to declare the type:
|
||||
the compiler figures it out.
|
||||
|
||||
I was impressed with the standard Go library: it included many of the
|
||||
capabilities required by modern applications, things like HTTP
|
||||
support, JSON handling and encryption. Many other languages required
|
||||
you to use a third-party library for these features, and often there
|
||||
were multiple competing libraries to choose from, adding another
|
||||
decision point for the developer. With Go, I could go to the standard
|
||||
library GoDoc and get started right away.
|
||||
|
||||
There were a few other language decisions that I found helpful. One is
|
||||
that the compiler figures out if you are returning a pointer to a
|
||||
local, and behind the scenes allocates the memory rather than using the
|
||||
stack. This prevents bugs, and I find the code more readable.
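
The pointer-to-a-local behavior described above can be sketched as follows. This is a minimal illustration; the point type and newPoint function are hypothetical names, not from the article.

```go
package main

import "fmt"

// point is a hypothetical type used only for this illustration.
type point struct{ x, y int }

// newPoint returns the address of a local variable. Because p outlives the
// call, the compiler's escape analysis places it on the heap instead of the
// stack, so the returned pointer stays valid.
func newPoint(x, y int) *point {
	p := point{x: x, y: y}
	return &p
}

func main() {
	a := newPoint(1, 2)
	b := newPoint(3, 4) // a second call does not overwrite the first value
	fmt.Println(a.x, a.y, b.x, b.y)
}
```

Building with `go build -gcflags=-m` prints the compiler's escape decisions, which is one way to check where a value ends up.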
|
||||
|
||||
I also like that you don't declare that you support an interface. I
|
||||
wasn't sure I would like this at first because it isn't obvious if a
|
||||
type implements a particular interface, but I found greater value in
|
||||
the fact that I wasn't dependent on the code author (even if it was
|
||||
me!) to declare that the interface is implemented. This first hit home
|
||||
when I used fmt.Println() and it automatically used the String() method
|
||||
I had implemented even though it hadn't occurred to me that I was
|
||||
implementing the Stringer interface.
|
||||
|
||||
The last feature I'll note is the ability to do concurrent programming
|
||||
through channels and goroutines. The model is simple to understand yet
|
||||
powerful.
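
To make the channel and goroutine model concrete, here is a minimal sketch. The sumSquares function is an invented example, not something from the article.

```go
package main

import "fmt"

// sumSquares starts one goroutine per input and collects the results
// over an unbuffered channel.
func sumSquares(nums []int) int {
	results := make(chan int)
	for _, n := range nums {
		go func(n int) { results <- n * n }(n)
	}
	total := 0
	for range nums { // receive exactly one result per goroutine
		total += <-results
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4})) // prints 30
}
```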
|
||||
|
||||
Reading code
|
||||
|
||||
After writing more Go code and starting to incorporate third party
|
||||
libraries, I had a realization that had never occurred to me before:
|
||||
as a developer, I spend a lot of time reading code. In fact, I probably
|
||||
spend more time reading code than writing it, once you start counting
|
||||
code reviews, debugging, and evaluating third-party libraries.
|
||||
|
||||
What was different about reading Go code? I would summarize it by "it
|
||||
all looks the same." What do I mean by that? Go format ensures all the
|
||||
braces are in the same spot; capitalized identifiers are exported;
|
||||
there are no implicit conversions, even of internal types; and there is
|
||||
no overloading of operators, functions or methods. That means that with
|
||||
Go code, "what you see is what you get" with no hidden meaning. Of
|
||||
course, it doesn't help me to understand a complicated algorithm, but
|
||||
it does mean that I can concentrate more on that algorithm because I
|
||||
don't have to understand whether "+" is overloaded, for example.
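
The "no implicit conversions" point can be shown in a few lines. The average function and its parameter types are illustrative assumptions.

```go
package main

import "fmt"

// average divides two integers of different widths. Writing total / count
// directly would not compile, because int64 and int32 are distinct types,
// so every conversion is spelled out where the reader can see it.
func average(total int64, count int32) float64 {
	return float64(total) / float64(count)
}

func main() {
	fmt.Println(average(7, 2)) // prints 3.5
}
```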
|
||||
|
||||
I was also pleasantly surprised when I used GoDoc on one of my
|
||||
projects, and discovered that I had semi-reasonable documentation
|
||||
without doing anything while writing the code other than adding
|
||||
comments on my functions and methods based on nagging from the IDE I
|
||||
was using. I did spend some time cleaning up the comments after that,
|
||||
but I'm not sure I would have even started that work if Go hadn't given
|
||||
me a great starting point.
|
||||
|
||||
Testing code
|
||||
|
||||
Go test is part of the standard Go tools and supported by IDEs, making
|
||||
it easy to get started creating unit tests for my code. And like the
|
||||
standard Go library, having a standard way to do tests means I don't
|
||||
have to evaluate external testing frameworks and select one. I can also
|
||||
understand the tests when I'm evaluating a third party library.
|
||||
|
||||
Even better, the default behavior running package tests in VSCode is to
|
||||
enable Go's built-in code coverage. I had never taken code coverage
|
||||
seriously working in other languages, partly because it was often
|
||||
difficult to set up. But the immediate feedback (helped by the blazing
|
||||
compile speed) gamified this for me, and I found myself adding tests to
|
||||
increase code coverage (and finding new bugs along the way).
|
||||
|
||||
Go doesnât allow circular dependencies between packages. While this has
|
||||
caused me some rethinking while writing code, I find it makes my
|
||||
testing regimen easier to think about: if I depend on a package, I can
|
||||
rely on that package to have its own tests covering its capabilities.
|
||||
|
||||
Deploying code
|
||||
|
||||
I learned Go at the same time we were migrating towards container-based
|
||||
microservices. In that environment, the fact that Go produces a single,
|
||||
self-contained executable makes it much easier and more efficient to
|
||||
build and manage containers. I can build a container layer with one
|
||||
single file, which is often a single-digit number of MB in size,
|
||||
compared to our prior JVM-based containers which started with hundreds
|
||||
of MB for the Java runtime then another layer for our application. (It
|
||||
is easy to forget how much this overhead ends up costing in production,
|
||||
particularly if you have hundreds or thousands of containers running).
|
||||
|
||||
Second, Go has built-in cross compiling capabilities so our development
|
||||
machines, containers and cloud hardware don't all have to be on the
|
||||
same processor or operating system. For example, I can use a Linux
|
||||
build machine to produce client executables for Linux, Mac and Windows.
|
||||
Again, this takes away a complicated decision process due to artificial
|
||||
constraints.
|
||||
|
||||
Finally, Go has established a well defined set of principles for
|
||||
versioning and compatibility. While not all pieces of this are
|
||||
enforced, having the principles from an authoritative source helps
|
||||
manage the real life challenges of keeping your software supply chain
|
||||
up to date. For example, it is strongly recommended that breaking
|
||||
changes require a new major version number. While not enforced, it
|
||||
leads the community to call out any open source package that violates
|
||||
this principle.
|
||||
|
||||
What do I miss?
|
||||
|
||||
I did miss generics; thankfully Go 1.18 added support. And I do wish
|
||||
the standard library offered immutable collections (like Scala and
|
||||
other functional languages). Embedding instead of inheritance works
|
||||
pretty much the same in many cases, but requires some deep thinking
|
||||
sometimes.
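
One case where embedding differs from inheritance is method dispatch. The sketch below uses hypothetical logger and server types to show it.

```go
package main

import "fmt"

type logger struct{ prefix string }

func (l logger) describe() string { return l.prefix + ": logger" }

// log calls describe on its own receiver, which is always a logger.
func (l logger) log(msg string) string { return l.describe() + " says " + msg }

// server embeds logger, so logger's fields and methods are promoted.
type server struct {
	logger
	addr string
}

// describe shadows the promoted method when called on a server.
func (s server) describe() string { return s.prefix + ": server at " + s.addr }

func main() {
	s := server{logger: logger{prefix: "api"}, addr: ":8080"}
	fmt.Println(s.describe()) // api: server at :8080
	// The promoted log method still calls logger.describe: embedding has
	// no virtual dispatch, unlike an overridden method in a class hierarchy.
	fmt.Println(s.log("hi")) // api: logger says hi
}
```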
|
||||
|
||||
My most frequent coding mistake is when I should have used a pointer
|
||||
receiver for a method and didn't, then modify the receiver expecting
|
||||
the changes to be visible when the method returns. The code looks
|
||||
correct, the right values get assigned if I use a debugger to step
|
||||
through or issue prints, but the changes disappear after the method
|
||||
returns. I think I would have preferred immutable receivers; that
|
||||
would have caught these errors at compile time, and in the few
|
||||
remaining cases where I wanted to modify the receiver I would have
|
||||
copied it to a local variable.
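
The receiver mistake described above looks like this in a minimal sketch. The counter type and both method names are hypothetical.

```go
package main

import "fmt"

type counter struct{ n int }

// incByValue has a value receiver: c is a copy, so the increment is
// lost when the method returns.
func (c counter) incByValue() { c.n++ }

// incByPointer has a pointer receiver: the increment changes the
// caller's value.
func (c *counter) incByPointer() { c.n++ }

func main() {
	var c counter
	c.incByValue()
	fmt.Println(c.n) // prints 0: the change disappeared
	c.incByPointer()
	fmt.Println(c.n) // prints 1
}
```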
|
||||
|
||||
In conclusion
|
||||
|
||||
As you can tell, I am a huge fan of Go, from even before I joined
|
||||
Google. I am impressed by the language and ecosystem design, and by the
|
||||
implementation. For me, Go makes me a more productive developer and I'm
|
||||
more confident in the quality of the code I produce.
|
||||
|
||||
Go, give it a [46]try!
|
||||
|
||||
Posted in
|
||||
* [47]Application Modernization
|
||||
* [48]Application Development
|
||||
* [49]Open Source
|
||||
|
||||
|
||||
References
|
||||
|
||||
Visible links:
|
||||
1. https://cloud.google.com/?lfhs=2
|
||||
2. https://cloud.google.com/blog/#content
|
||||
3. https://cloud.google.com/blog
|
||||
4. https://cloud.google.com/contact/
|
||||
5. https://console.cloud.google.com/freetrial/
|
||||
6. https://cloud.google.com/blog
|
||||
7. https://cloud.google.com/blog/products/ai-machine-learning
|
||||
8. https://cloud.google.com/blog/products/api-management
|
||||
9. https://cloud.google.com/blog/products/application-development
|
||||
10. https://cloud.google.com/blog/products/application-modernization
|
||||
11. https://cloud.google.com/blog/products/chrome-enterprise
|
||||
12. https://cloud.google.com/blog/products/compute
|
||||
13. https://cloud.google.com/blog/products/containers-kubernetes
|
||||
14. https://cloud.google.com/blog/products/data-analytics
|
||||
15. https://cloud.google.com/blog/products/databases
|
||||
16. https://cloud.google.com/blog/products/devops-sre
|
||||
17. https://cloud.google.com/blog/topics/maps-geospatial
|
||||
18. https://cloud.google.com/blog/products/identity-security
|
||||
19. https://cloud.google.com/blog/products/infrastructure
|
||||
20. https://cloud.google.com/blog/products/infrastructure-modernization
|
||||
21. https://cloud.google.com/blog/products/networking
|
||||
22. https://cloud.google.com/blog/products/productivity-collaboration
|
||||
23. https://cloud.google.com/blog/products/sap-google-cloud
|
||||
24. https://cloud.google.com/blog/products/storage-data-transfer
|
||||
25. https://cloud.google.com/blog/topics/sustainability
|
||||
26. https://cloud.google.com/transform
|
||||
27. https://cloud.google.com/blog/topics/financial-services
|
||||
28. https://cloud.google.com/blog/topics/healthcare-life-sciences
|
||||
29. https://cloud.google.com/blog/topics/manufacturing
|
||||
30. https://cloud.google.com/blog/products/media-entertainment
|
||||
31. https://cloud.google.com/blog/topics/public-sector
|
||||
32. https://cloud.google.com/blog/topics/retail
|
||||
33. https://cloud.google.com/blog/topics/supply-chain-logistics
|
||||
34. https://cloud.google.com/blog/topics/telecommunications
|
||||
35. https://cloud.google.com/blog/topics/partners
|
||||
36. https://cloud.google.com/blog/topics/startups
|
||||
37. https://cloud.google.com/blog/topics/training-certifications
|
||||
38. https://cloud.google.com/blog/topics/inside-google-cloud
|
||||
39. https://cloud.google.com/blog/topics/google-cloud-next
|
||||
40. https://cloud.google.com/blog/products/maps-platform
|
||||
41. https://workspace.google.com/blog
|
||||
42. https://cloud.google.com/blog/topics/developers-practitioners
|
||||
43. https://cloud.google.com/transform
|
||||
44. https://cloud.google.com/contact/
|
||||
45. https://console.cloud.google.com/freetrial/
|
||||
46. https://go.dev/tour/list
|
||||
47. https://cloud.google.com/blog/products/application-modernization
|
||||
48. https://cloud.google.com/blog/products/application-development
|
||||
49. https://cloud.google.com/blog/products/open-source
|
||||
50. https://cloud.google.com/
|
||||
51. https://cloud.google.com/products/
|
||||
52. https://myaccount.google.com/privacypolicy?hl=en-US
|
||||
53. https://myaccount.google.com/termsofservice?hl=en-US
|
||||
54. https://support.google.com/
|
||||
|
||||
|
||||
574
static/archive/crawshaw-io-k5slfj.txt
Normal file
@@ -0,0 +1,574 @@
|
||||
|
||||
|
||||
One process programming notes (with Go and SQLite)
|
||||
|
||||
2018 July 30
|
||||
|
||||
Blog-ified version of a talk I gave at [2]Go Northwest.
|
||||
|
||||
This content covers my recent exploration of writing internet services,
|
||||
iOS apps, and macOS programs as an indie developer.
|
||||
|
||||
There are several topics here that should each have their own blog
|
||||
post. But as I have a lot of programming to do I am going to put these
|
||||
notes up as is and split the material out some time later.
|
||||
|
||||
My focus has been on how to adapt the lessons I have learned working in
|
||||
teams at Google to a single programmer building small business work.
|
||||
There are many great engineering practices in Silicon Valleyʼs big
|
||||
companies and well-capitalized VC firms, but one person does not have
|
||||
enough bandwidth to use them all and write software. The exercise for
|
||||
me is: what to keep and what must go.
|
||||
|
||||
If I have been doing it right, the technology and techniques described
|
||||
here will sound easy. I have to fit it all in my head while having
|
||||
enough capacity left over to write software people want. Every extra
|
||||
thing has great cost, especially rarely touched software that comes
|
||||
back to bite in the middle of the night six months later.
|
||||
|
||||
Two key technologies I have decided to use are Go and SQLite.
|
||||
|
||||
A brief introduction to SQLite
|
||||
|
||||
SQLite is an implementation of SQL. Unlike traditional database
|
||||
implementations like PostgreSQL or MySQL, SQLite is a self-contained C
|
||||
library designed to be embedded into programs. It has been built by D.
|
||||
Richard Hipp since its release in 2000, and in the past 18 years other
|
||||
open source contributors have helped. At this point it has been around
|
||||
most of the time I have been programming and is a core part of my
|
||||
programming toolbox.
|
||||
|
||||
Hands-on with the SQLite command line tool
|
||||
|
||||
Rather than talk through SQLite in the abstract, let me show it to you.
|
||||
|
||||
A kind person on Kaggle has [3]provided a CSV file of the plays of
|
||||
Shakespeare. Letʼs build an SQLite database out of it.
|
||||
$ head shakespeare_data.csv
"Dataline","Play","PlayerLinenumber","ActSceneLine","Player","PlayerLine"
"1","Henry IV",,,,"ACT I"
"2","Henry IV",,,,"SCENE I. London. The palace."
"3","Henry IV",,,,"Enter KING HENRY, LORD JOHN OF LANCASTER, the EARL of WESTMORELAND, SIR WALTER BLUNT, and others"
"4","Henry IV","1","1.1.1","KING HENRY IV","So shaken as we are, so wan with care,"
"5","Henry IV","1","1.1.2","KING HENRY IV","Find we a time for frighted peace to pant,"
"6","Henry IV","1","1.1.3","KING HENRY IV","And breathe short-winded accents of new broils"
"7","Henry IV","1","1.1.4","KING HENRY IV","To be commenced in strands afar remote."
"8","Henry IV","1","1.1.5","KING HENRY IV","No more the thirsty entrance of this soil"
"9","Henry IV","1","1.1.6","KING HENRY IV","Shall daub her lips with her own children's blood,"
|
||||
|
||||
First, letʼs use the sqlite command line tool to create a new database
|
||||
and import the CSV.
|
||||
$ sqlite3 shakespeare.db
|
||||
sqlite> .mode csv
|
||||
sqlite> .import shakespeare_data.csv import
|
||||
|
||||
Done! A couple of SELECTs will let us quickly see if it worked.
|
||||
sqlite> SELECT count(*) FROM import;
|
||||
111396
|
||||
sqlite> SELECT * FROM import LIMIT 10;
1,"Henry IV","","","","ACT I"
2,"Henry IV","","","","SCENE I. London. The palace."
3,"Henry IV","","","","Enter KING HENRY, LORD JOHN OF LANCASTER, the EARL of WESTMORELAND, SIR WALTER BLUNT, and others"
4,"Henry IV",1,1.1.1,"KING HENRY IV","So shaken as we are, so wan with care,"
5,"Henry IV",1,1.1.2,"KING HENRY IV","Find we a time for frighted peace to pant,"
6,"Henry IV",1,1.1.3,"KING HENRY IV","And breathe short-winded accents of new broils"
7,"Henry IV",1,1.1.4,"KING HENRY IV","To be commenced in strands afar remote."
8,"Henry IV",1,1.1.5,"KING HENRY IV","No more the thirsty entrance of this soil"
9,"Henry IV",1,1.1.6,"KING HENRY IV","Shall daub her lips with her own children's blood,"
|
||||
|
||||
Looks good! Now we can do a little cleanup. The original CSV contains a
|
||||
column called ActSceneLine that uses dots to encode Act number, Scene
|
||||
number, and Line number. Those would look much nicer as their own
|
||||
columns.
|
||||
sqlite> CREATE TABLE plays (rowid INTEGER PRIMARY KEY, play, linenumber, act, scene, line, player, text);
sqlite> .schema
CREATE TABLE import (rowid primary key, play, playerlinenumber, actsceneline, player, playerline);
CREATE TABLE plays (rowid primary key, play, linenumber, act, scene, line, player, text);
|
||||
sqlite> INSERT INTO plays SELECT
|
||||
rowid,
|
||||
play,
|
||||
playerlinenumber AS linenumber,
|
||||
substr(actsceneline, 1, 1) AS act,
|
||||
substr(actsceneline, 3, 1) AS scene,
|
||||
substr(actsceneline, 5, 5) AS line,
|
||||
player,
|
||||
playerline AS text
|
||||
FROM import;
|
||||
|
||||
(The substr above can be improved by using instr to find the ʼ.ʼ
|
||||
characters. Exercise left for the reader.)
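
The fixed offsets passed to substr assume single-digit act and scene numbers. As an alternative to the instr exercise, the same split could be done in Go before inserting rows. This is a sketch only; splitActSceneLine is a hypothetical helper, not part of the original post.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitActSceneLine parses a value such as "3.1.65" into act, scene and
// line numbers. It splits on the dots instead of relying on fixed offsets.
func splitActSceneLine(s string) (act, scene, line int, err error) {
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return 0, 0, 0, fmt.Errorf("want 3 dot-separated fields, got %q", s)
	}
	nums := make([]int, 3)
	for i, p := range parts {
		if nums[i], err = strconv.Atoi(p); err != nil {
			return 0, 0, 0, err
		}
	}
	return nums[0], nums[1], nums[2], nil
}

func main() {
	fmt.Println(splitActSceneLine("3.1.65"))
	fmt.Println(splitActSceneLine("1.10.25")) // a two-digit scene number
}
```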
|
||||
|
||||
Here we used the INSERT ... SELECT syntax to build a table out of
|
||||
another table. The ActSceneLine column was split apart using the
|
||||
builtin SQLite function substr, which slices strings.
|
||||
|
||||
The result:
|
||||
sqlite> SELECT * FROM plays LIMIT 10;
1,"Henry IV","","","","","","ACT I"
2,"Henry IV","","","","","","SCENE I. London. The palace."
3,"Henry IV","","","","","","Enter KING HENRY, LORD JOHN OF LANCASTER, the EARL of WESTMORELAND, SIR WALTER BLUNT, and others"
4,"Henry IV",1,1,1,1,"KING HENRY IV","So shaken as we are, so wan with care,"
5,"Henry IV",1,1,1,2,"KING HENRY IV","Find we a time for frighted peace to pant,"
6,"Henry IV",1,1,1,3,"KING HENRY IV","And breathe short-winded accents of new broils"
7,"Henry IV",1,1,1,4,"KING HENRY IV","To be commenced in strands afar remote."
8,"Henry IV",1,1,1,5,"KING HENRY IV","No more the thirsty entrance of this soil"
9,"Henry IV",1,1,1,6,"KING HENRY IV","Shall daub her lips with her own children's blood,"
|
||||
|
||||
Now we have our data, let us search for something:
|
||||
sqlite> SELECT * FROM plays WHERE text LIKE "whether tis nobler%";
|
||||
sqlite>
|
||||
|
||||
That did not work. Hamlet definitely says that, but perhaps the text
|
||||
formatting is slightly off. SQLite to the rescue. It ships with a Full
|
||||
Text Search extension compiled in. Let us index all of Shakespeare with
|
||||
FTS5:
|
||||
sqlite> CREATE VIRTUAL TABLE playsearch USING fts5(playsrowid, text);
|
||||
sqlite> INSERT INTO playsearch SELECT rowid, text FROM plays;
|
||||
|
||||
Now we can search for our soliloquy:
|
||||
sqlite> SELECT rowid, text FROM playsearch WHERE text MATCH "whether tis nobler";
|
||||
34232|Whether 'tis nobler in the mind to suffer
|
||||
|
||||
Success! The act and scene can be acquired by joining with our original
|
||||
table.
|
||||
sqlite> SELECT play, act, scene, line, player, plays.text
|
||||
FROM playsearch
|
||||
INNER JOIN plays ON playsearch.playsrowid = plays.rowid
|
||||
WHERE playsearch.text MATCH "whether tis nobler";
|
||||
Hamlet|3|1|65|HAMLET|Whether 'tis nobler in the mind to suffer
|
||||
|
||||
Letʼs clean up.
|
||||
sqlite> DROP TABLE import;
|
||||
sqlite> VACUUM;
|
||||
|
||||
Finally, what does all of this look like on the file system?
|
||||
$ ls -l
|
||||
-rwxr-xr-x@ 1 crawshaw staff 10188854 Apr 27 2017 shakespeare_data.csv
|
||||
-rw-r--r-- 1 crawshaw staff 22286336 Jul 25 22:05 shakespeare.db
|
||||
|
||||
There you have it. The SQLite database contains two full copies of the
|
||||
plays of Shakespeare, one with a full text search index, and stores
|
||||
both of them in about twice the space it takes the original CSV file to
|
||||
store one. Not bad.
|
||||
|
||||
That should give you a feel for the i-t-e of SQLite.
|
||||
|
||||
And scene.
|
||||
|
||||
Using SQLite from Go
|
||||
|
||||
The standard database/sql
|
||||
|
||||
There are a number of cgo-based [4]database/sql drivers available for
|
||||
SQLite. The most popular one appears to be
|
||||
[5]github.com/mattn/go-sqlite3. It gets the job done and is probably
|
||||
what you want.
|
||||
|
||||
Using the database/sql package it is straightforward to open an SQLite
|
||||
database and execute SQL statements on it. For example, we can run the
|
||||
FTS query from earlier using this Go code:
|
||||
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	db, err := sql.Open("sqlite3", "shakespeare.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Join the FTS5 index back to the plays table to recover act and scene.
	stmt, err := db.Prepare(`
		SELECT play, act, scene, plays.text
		FROM playsearch
		INNER JOIN plays ON playsearch.playsrowid = plays.rowid
		WHERE playsearch.text MATCH ?;`)
	if err != nil {
		log.Fatal(err)
	}
	defer stmt.Close()

	var play, text string
	var act, scene int
	err = stmt.QueryRow("whether tis nobler").Scan(&play, &act, &scene, &text)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s %d:%d: %q\n", play, act, scene, text)
}
|
||||
|
||||
Executing it yields:
|
||||
Hamlet 3:1: "Whether 'tis nobler in the mind to suffer"
|
||||
|
||||
A low-level wrapper: crawshaw.io/sqlite
|
||||
|
||||
Just as SQLite steps beyond the basics of SELECT, INSERT, UPDATE,
|
||||
DELETE with full-text search, it has several other interesting features
|
||||
and extensions that cannot be accessed by SQL statements alone. These
|
||||
need specialized interfaces, and many of the interfaces are not
|
||||
supported by any of the existing drivers.
|
||||
|
||||
So I wrote my own. You can get it from [6]crawshaw.io/sqlite. In
|
||||
particular, it supports the streaming blob interface, the [7]session
|
||||
extension, and implements the necessary sqlite3_unlock_notify machinery
|
||||
to make good use of the [8]shared cache for connection pools. I am
|
||||
going to cover these features through two use case studies: the client
|
||||
and the cloud.
|
||||
|
||||
cgo
|
||||
|
||||
All of these approaches rely on cgo for integrating C into Go. This is
|
||||
straightforward to do, but adds some operational complexity. Building a
|
||||
Go program using SQLite requires a C compiler for the target.
|
||||
|
||||
In practice, this means if you develop on macOS you need to install a
|
||||
cross-compiler for linux.
|
||||
|
||||
Typical concerns about the impact on software quality of adding C code
|
||||
to Go do not apply to SQLite as it has an extraordinary degree of
|
||||
testing. The quality of the code is exceptional.
|
||||
|
||||
Go and SQLite for the client
|
||||
|
||||
I am building an [9]iOS app, with almost all the code written in Go and
|
||||
the UI provided by a web view. This app has a full copy of the user
|
||||
data, it is not a thin view onto an internet server. This means storing
|
||||
a large amount of local, structured data, on-device full text
|
||||
searching, background tasks working on the database in a way that does
|
||||
not disrupt the UI, and syncing DB changes to a backup in the cloud.
|
||||
|
||||
That is a lot of moving parts for a client. More than I want to write
|
||||
in JavaScript, and more than I want to write in Swift and then have to
|
||||
promptly rewrite if I ever manage to build an Android app. More
|
||||
importantly, the server is in Go, and I am one independent developer.
|
||||
It is absolutely vital I reduce the number of moving pieces in my
|
||||
development environment to the smallest possible number. Hence the
|
||||
effort to build (the big bits) of a client using the exact same
|
||||
technology as my server.
|
||||
|
||||
The Session extension
|
||||
|
||||
The session extension lets you start a session on an SQLite connection.
|
||||
All changes made to the database through that connection are bundled
|
||||
into a patchset blob. The extension also provides a method for applying
|
||||
the generated patchset to a table.
|
||||
func (conn *Conn) CreateSession(db string) (*Session, error)
|
||||
|
||||
func (s *Session) Changeset(w io.Writer) error
|
||||
|
||||
func (conn *Conn) ChangesetApply(
|
||||
r io.Reader,
|
||||
filterFn func(tableName string) bool,
|
||||
conflictFn func(ConflictType, ChangesetIter) ConflictAction,
|
||||
) error
|
||||
|
||||
This can be used to build a very simple client-sync system. Collect the
|
||||
changes made in a client, periodically bundle them up into a changeset
|
||||
and upload it to the server where it is applied to a backup copy of the
|
||||
database. If another client changes the database then the server
|
||||
advertises it to the client, who downloads a changeset and applies it.
|
||||
|
||||
This requires a bit of care in the database design. The reason I kept
|
||||
the FTS table separate in the Shakespeare example is I keep my FTS
|
||||
tables in a separate attached database (which in SQLite, means a
|
||||
different file). The cloud backup database never generates the FTS
|
||||
tables, the client is free to generate the tables in a background
|
||||
thread and they can lag behind data backups.
|
||||
|
||||
Another point of care is minimizing conflicts. The biggest one is
|
||||
AUTOINCREMENT keys. By default the primary key of a rowid table is
|
||||
incremented, which means if you have multiple clients generating rowids
|
||||
you will see lots of conflicts.
|
||||
|
||||
I have been trialing two different solutions. The first is having each
|
||||
client register a rowid range with the server and only allocate from
|
||||
its own range. It works. The second is randomly generating int64
|
||||
values, and relying on the low collision rate. So far it works too.
|
||||
Both strategies have risks, and I havenʼt decided which is better.
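The two rowid strategies described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the author's code: the names rangeAllocator, randomRowID and errRangeExhausted are hypothetical, and the range is assumed to have already been handed out by the server.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"errors"
	"fmt"
)

// rangeAllocator (hypothetical) hands out rowids from the half-open range
// [next, end) that the server is assumed to have assigned to this client.
type rangeAllocator struct {
	next, end int64
}

var errRangeExhausted = errors.New("rowid range exhausted")

// Next returns the next unused rowid in the client's own range, or an
// error once the range is used up and a new one must be registered.
func (a *rangeAllocator) Next() (int64, error) {
	if a.next >= a.end {
		return 0, errRangeExhausted
	}
	id := a.next
	a.next++
	return id, nil
}

// randomRowID returns a random non-negative int64. Collisions between
// clients are possible but unlikely across a 63-bit space.
func randomRowID() (int64, error) {
	var b [8]byte
	if _, err := rand.Read(b[:]); err != nil {
		return 0, err
	}
	// Clear the sign bit so the result is never negative.
	return int64(binary.BigEndian.Uint64(b[:]) &^ (1 << 63)), nil
}

func main() {
	a := &rangeAllocator{next: 1000, end: 1002}
	for i := 0; i < 3; i++ {
		id, err := a.Next()
		fmt.Println(id, err)
	}
	id, err := randomRowID()
	fmt.Println(id >= 0, err)
}
```

The range approach needs a round trip to the server whenever a range runs out, while the random approach needs none but relies on the conflict handler for the rare collision.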
|
||||
|
||||
In practice, I have found I have to limit DB updates to a single
|
||||
connection to keep changeset quality high. (A changeset does not see
|
||||
changes made on other connections.) To do this I maintain a read-only
|
||||
pool of connections and a single guarded read-write connection in a
|
||||
pool of 1. The code only grabs the read-write connection when it needs
|
||||
it, and the read-only connections are enforced by the read-only bit on
|
||||
the SQLite connection.
|
||||
|
||||
Nested Transactions
|
||||
|
||||
The database/sql driver encourages the use of SQL transactions with its
|
||||
Tx type, but this does not appear to play well with nested
|
||||
transactions. This is a concept implemented by SAVEPOINT / RELEASE in
|
||||
SQL, and it makes for surprisingly composable code.
|
||||
|
||||
If a function needs to make multiple statements in a transaction, it
|
||||
can open with a SAVEPOINT, then defer a call to RELEASE if the function
|
||||
produces no Go return error, or if it does instead call ROLLBACK and
|
||||
return the error.
|
||||
func f(conn *sqlite.Conn) (err error) {
|
||||
conn...SAVEPOINT
|
||||
defer func() {
|
||||
if err == nil {
|
||||
conn...RELEASE
|
||||
} else {
|
||||
conn...ROLLBACK
|
||||
}
|
||||
}()
|
||||
}
|
||||
|
||||
Now if this transactional function f needs to call another
|
||||
transactional function g, then g can use exactly the same strategy and
|
||||
f can call it in a very traditional Go way:
|
||||
if err := g(conn); err != nil {
|
||||
return err // all changes in f will be rolled back by the defer
|
||||
}
|
||||
|
||||
The function g is also perfectly safe to use in its own right, as it
|
||||
has its own transaction.
|
||||
|
||||
I have been using this SAVEPOINT + defer RELEASE or return an error
|
||||
semantics for several months now and find it invaluable. It makes it
|
||||
easy to safely wrap code in SQL transactions.
|
||||
|
||||
The example above however is a bit bulky, and there are some edge cases
|
||||
that need to be handled. (For example, if the RELEASE fails, then an
|
||||
error needs to be returned.) So I have wrapped this up in a utility:
|
||||
func f(conn *sqlite.Conn) (err error) {
|
||||
defer sqlitex.Save(conn)(&err)
|
||||
|
||||
// Code is transactional and can be stacked
|
||||
// with other functions that call sqlitex.Save.
|
||||
}
|
||||
|
||||
The first time you see sqlitex.Save in action it can be a little
|
||||
off-putting, at least it was for me when I first created it. But I
|
||||
quickly got used to it, and it does a lot of heavy lifting. The first
|
||||
call to sqlitex.Save opens a SAVEPOINT on the conn and returns a
|
||||
closure that either RELEASEs or ROLLBACKs depending on the value of
|
||||
err, and sets err if necessary.
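To make the mechanics concrete, here is a rough standard-library sketch of how a Save-style helper could work. It is not the sqlitex implementation: execer, logConn and save are hypothetical names, the connection only records statements, and the sketch panics if the SAVEPOINT itself fails, which is a simplification.

```go
package main

import (
	"errors"
	"fmt"
)

// execer is a hypothetical minimal interface for running one SQL statement.
type execer interface {
	Exec(stmt string) error
}

// logConn records statements instead of talking to a real database.
type logConn struct{ stmts []string }

func (c *logConn) Exec(s string) error {
	c.stmts = append(c.stmts, s)
	return nil
}

// save opens a SAVEPOINT immediately and returns a closure intended to be
// deferred with a pointer to the caller's named error result.
func save(c execer) func(*error) {
	if err := c.Exec("SAVEPOINT s;"); err != nil {
		panic(err) // simplification for this sketch
	}
	return func(errp *error) {
		if *errp == nil {
			// Success: release the savepoint, surfacing any failure.
			if err := c.Exec("RELEASE s;"); err != nil {
				*errp = err
			}
			return
		}
		// Failure: undo this level's changes, then drop the savepoint.
		c.Exec("ROLLBACK TO s;")
		c.Exec("RELEASE s;")
	}
}

// g is transactional on its own and can also be nested inside f.
func g(c execer, fail bool) (err error) {
	defer save(c)(&err)
	if fail {
		return errors.New("g failed")
	}
	return c.Exec("INSERT INTO t VALUES (1);")
}

func f(c execer, fail bool) (err error) {
	defer save(c)(&err)
	if err := c.Exec("UPDATE t SET x = 1;"); err != nil {
		return err
	}
	return g(c, fail)
}

func main() {
	c := &logConn{}
	fmt.Println(f(c, true))
	for _, s := range c.stmts {
		fmt.Println(s)
	}
}
```

The key detail is that save(c) runs at the defer statement, so the SAVEPOINT is opened up front, while the returned closure runs at function exit and reads the final value of err through the pointer.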
|
||||
|
||||
Go and SQLite in the cloud
|
||||
|
||||
I have spent several months now redesigning services I have encountered
|
||||
before and designing services for problems I would like to work on
|
||||
going forward. The process has led me to a general design that works
|
||||
for many problems and I quite enjoy building.
|
||||
|
||||
It can be summarized as 1 VM, 1 Zone, 1 process programming.
|
||||
|
||||
If this sounds ridiculously simplistic to you, I think thatʼs good! It
|
||||
is simple. It does not meet all sorts of requirements that we would
|
||||
like our modern fancy cloud services to meet. It is not "serverless",
|
||||
which means when a service is extremely small it does not run for free,
|
||||
and when a service grows it does not automatically scale. Indeed, there
|
||||
is an explicit scaling limit. Right now the best server you can get
|
||||
from Amazon is roughly:
|
||||
* 128 CPU threads at ~4GHz
|
||||
* 4TB RAM
|
||||
* 25 Gbit ethernet
|
||||
* 10 Gbps NAS
|
||||
* hours of yearly downtime
|
||||
|
||||
That is a huge potential downside of one process programming.
|
||||
However, I claim that is a livable limit.
|
||||
|
||||
I claim typical services do not hit this scaling limit.
|
||||
|
||||
If you are building a small business, most products can grow and become
|
||||
profitable well under this limit for years. When you see the limit
|
||||
approaching in the next year or two, you have a business with revenue
|
||||
to hire more than one engineer, and the new team can, in the face of
|
||||
radically changing business requirements, rewrite the service.
|
||||
|
||||
Reaching this limit is a good problem to have because when it comes you
|
||||
will have plenty of time to deal with it and the human resources you
|
||||
need to solve it well.
|
||||
|
||||
Early in the life of a small business you donʼt, and every hour you
|
||||
spend trying to work beyond this scaling limit is an hour that would
|
||||
have been better spent talking to your customers about their needs.
|
||||
|
||||
The principle at work here is:
|
||||
|
||||
Donʼt use N computers when 1 will do.
|
||||
|
||||
To go into a bit more technical detail,
|
||||
|
||||
I run a single VM on AWS, in a single availability zone. The VM has
|
||||
three EBS volumes (this is Amazon's name for NAS). The first holds the
|
||||
OS, logs, temporary files, and any ephemeral SQLite databases that are
|
||||
generated from the main databases, e.g. FTS tables. The second holds the
|
||||
primary SQLite database for the main service. The third holds the
|
||||
customer sync SQLite databases.
|
||||
|
||||
The system is configured to periodically snapshot the system EBS volume
|
||||
and the customer EBS volumes to S3, the Amazon geo-redundant blob
|
||||
store. This is a relatively cheap operation that can be scripted,
|
||||
because only blocks that change are copied.
|
||||
|
||||
The main EBS volume is backed up to S3 very regularly, by custom code
|
||||
that flushes the WAL cache. Iʼll explain that in a bit.
|
||||
|
||||
The service is a single Go binary running on this VM. The machine has
|
||||
plenty of extra RAM that is used by linuxʼs disk cache. (And that can
|
||||
be used by a second copy of the service spinning up for low down-time
|
||||
replacement.)
|
||||
|
||||
The result of this is a service that has at most tens of hours of
|
||||
downtime a year, about as much chance of suffering block loss as a
|
||||
physical computer with a RAID5 array, and active offsite backups being
|
||||
made every few minutes to a distributed system that is built and
|
||||
maintained by a large team.
|
||||
|
||||
This system is astonishingly simple. I shell into one machine. It is a
|
||||
linux machine. I have a deploy script for the service that is ten lines
|
||||
long. Almost all of my performance work is done with pprof.
|
||||
|
||||
On a medium sized VM I can clock 5-6 thousand concurrent requests with
|
||||
only a few hours of performance tuning. On the largest machine AWS has,
|
||||
tens of thousands.
|
||||
|
||||
Now to talk a little more about the particulars of the stack:
|
||||
|
||||
Shared cache and WAL
|
||||
|
||||
To make the server extremely concurrent there are two important SQLite
|
||||
features I use. The first is the shared cache, which lets me allocate
|
||||
one large pool of memory to the database page cache and many concurrent
|
||||
connections can use it simultaneously. This requires some support in
|
||||
the driver for sqlite3_unlock_notify so user code doesnʼt need to deal
|
||||
with locking events, but that is transparent to end user code.
|
||||
|
||||
The second is the Write Ahead Log. This is a mode SQLite can be knocked
|
||||
into at the beginning of a connection, which changes the way it writes
|
||||
transactions to disk. Instead of locking the database and making
|
||||
modifications along with a rollback journal, it appends the new change
|
||||
to a separate file. This allows readers to work concurrently with the
|
||||
writer. The WAL has to be flushed periodically by SQLite, which
|
||||
involves locking the database and writing the changes from it. There
|
||||
are default settings for doing this.
|
||||
|
||||
I override these and execute WAL flushes manually from a package that,
|
||||
when it is done, also triggers an S3 snapshot. This package is called
|
||||
reallyfsync, and if I can work out how to test it properly I will make
|
||||
it open source.
|
||||
|
||||
Incremental Blob API
|
||||
|
||||
Another smaller feature, but one important to my particular server, is
|
||||
SQLiteʼs [10]incremental blob API. This allows a field of bytes to be
|
||||
read and written in the DB without storing all the bytes in memory
|
||||
simultaneously, which matters when it is possible for each request to
|
||||
be working with hundreds of megabytes, but you want tens of thousands
|
||||
of potential concurrent requests.
|
||||
|
||||
This is one of the places where the driver deviates from being a
|
||||
close-to-cgo wrapper to be more [11]Go-like:
|
||||
type Blob
|
||||
func (blob *Blob) Close() error
|
||||
func (blob *Blob) Read(p []byte) (n int, err error)
|
||||
func (blob *Blob) ReadAt(p []byte, off int64) (n int, err error)
|
||||
func (blob *Blob) Seek(offset int64, whence int) (int64, error)
|
||||
func (blob *Blob) Size() int64
|
||||
func (blob *Blob) Write(p []byte) (n int, err error)
|
||||
func (blob *Blob) WriteAt(p []byte, off int64) (n int, err error)
|
||||
|
||||
This looks a lot like a file, and indeed can be used like a file, with
|
||||
one caveat: the size of a blob is set when it is created. (As such, I
|
||||
still find temporary files to be useful.)
|
||||
|
||||
Designing with one process programming
|
||||
|
||||
I start with: Do you really need N computers?
|
||||
|
||||
Some problems really do. For example, you cannot build a low-latency
|
||||
index of the public internet with only 4TB of RAM. You need a lot more.
|
||||
These problems are great fun, and we like to talk a lot about them, but
|
||||
they are a relatively small amount of all the code written. So far all
|
||||
the projects I have been developing post-Google fit on 1 computer.
|
||||
|
||||
There are also more common sub-problems that are hard to solve with one
|
||||
computer. If you have a global customer base and need low-latency to
|
||||
your server, the speed of light gets in the way. But many of these
|
||||
problems can be solved with relatively straightforward CDN products.
|
||||
|
||||
Another great solution to the speed of light is geo-sharding. Have
|
||||
complete and independent copies of your service in multiple
|
||||
datacenters, move your userʼs data to the service near them. This can
|
||||
be as easy as having one small global redirect database (maybe SQLite
|
||||
on geo-redundant NFS!) redirecting the user to a specific DNS name like
|
||||
{us-east, us-west}.mservice.com.
|
||||
|
||||
Most problems do fit in one computer, up to a point. Spend some time
|
||||
determining where that point is. If it is years away there is a good
|
||||
chance one computer will do.
|
||||
|
||||
Indie dev techniques for the corporate programmer
|
||||
|
||||
Even if you do not write code in this particular technology stack and
|
||||
you are not an independent developer, there is value here. Use the one
|
||||
big VM, one zone, one process Go, SQLite, and snapshot backup stack as
|
||||
a hypothetical tool to test your designs.
|
||||
|
||||
So add a hypothetical step to your design process: If you solved your
|
||||
problem on this stack with one computer, how far could you get? How
|
||||
many customers could you support? At what size would you need to
|
||||
rewrite your software?
|
||||
|
||||
If this indie mini stack would last your business years, you might want
|
||||
to consider delaying the adoption of modern cloud software.
|
||||
|
||||
If you are a programmer at a well-capitalized company, you may also
|
||||
want to consider what development looks like for small internal or
|
||||
experimental projects. Do your coworkers have to use large complex
|
||||
distributed systems for policy reasons? Many of these projects will
|
||||
never need to scale beyond one computer, or if they do they will need a
|
||||
rewrite to deal with shifting requirements. In which case, find a way
|
||||
to make an indie stack, linux VMs with a file system, available for
|
||||
prototyping and experimentation.
|
||||
__________________________________________________________________
|
||||
|
||||
[12]Index
|
||||
[13]github.com/crawshaw
|
||||
[14]twitter.com/davidcrawshaw
|
||||
david@zentus.com
|
||||
|
||||
References
|
||||
|
||||
1. file:///atom.xml
|
||||
2. https://gonorthwest.io/
|
||||
3. https://www.kaggle.com/kingburrito666/shakespeare-plays
|
||||
4. https://golang.org/pkg/database/sql
|
||||
5. https://github.com/mattn/go-sqlite3
|
||||
6. https://crawshaw.io/sqlite
|
||||
7. https://www.sqlite.org/sessionintro.html
|
||||
8. https://www.sqlite.org/sharedcache.html
|
||||
9. https://www.posticulous.com/
|
||||
10. https://www.sqlite.org/c3ref/blob_open.html
|
||||
11. https://godoc.org/crawshaw.io/sqlite#Blob
|
||||
12. file:///
|
||||
13. https://github.com/crawshaw
|
||||
14. https://twitter.com/davidcrawshaw
|
||||