From e1228a6e9a093066e8556a25df76615edcb34044 Mon Sep 17 00:00:00 2001 From: David Eisinger Date: Tue, 13 Jun 2023 10:52:21 -0400 Subject: [PATCH] Add go notes --- content/notes/golang/index.md | 11 + static/archive/cloud-google-com-windxx.txt | 365 +++++++++++++ static/archive/crawshaw-io-k5slfj.txt | 574 +++++++++++++++++++++ 3 files changed, 950 insertions(+) create mode 100644 static/archive/cloud-google-com-windxx.txt create mode 100644 static/archive/crawshaw-io-k5slfj.txt diff --git a/content/notes/golang/index.md b/content/notes/golang/index.md index e8ed23c..e1f135a 100644 --- a/content/notes/golang/index.md +++ b/content/notes/golang/index.md @@ -2,6 +2,15 @@ title: "Golang" date: 2023-05-08T09:54:48-04:00 draft: false +references: +- title: "Why David Yach Loves Go" + url: https://cloud.google.com/blog/products/application-modernization/why-david-yach-loves-go + date: 2023-06-13T14:51:05Z + file: cloud-google-com-windxx.txt +- title: "One process programming notes (with Go and SQLite)" + url: https://crawshaw.io/blog/one-process-programming-notes + date: 2023-06-13T14:49:51Z + file: crawshaw-io-k5slfj.txt --- I find [Go][1] really compelling, even though it's not super applicable to my job. When evaluating a new tool, I find I'm weirdly biased to things written in Go. 
@@ -51,7 +60,9 @@ I find [Go][1] really compelling, even though it's not super applicable to my jo
 * [Standard Go Project Layout][8]
 * [The files & folders of Go projects][9]
 * [Why David Yach Loves Go][10]
+* [One process programming notes (with Go and SQLite)][11]
 
 [8]: https://github.com/golang-standards/project-layout
 [9]: https://changelog.com/gotime/278
 [10]: https://cloud.google.com/blog/products/application-modernization/why-david-yach-loves-go
+[11]: https://crawshaw.io/blog/one-process-programming-notes
diff --git a/static/archive/cloud-google-com-windxx.txt b/static/archive/cloud-google-com-windxx.txt
new file mode 100644
index 0000000..8c4806f
--- /dev/null
+++ b/static/archive/cloud-google-com-windxx.txt
@@ -0,0 +1,365 @@
+ #[1]home
+
+ [2]Jump to Content
+ Application Modernization
+
+ Why I love Go
+
+ September 12, 2022
+
+ David Yach
+
+ Director of Engineering at Google Cloud
+
+ I've been building software over the last four decades, as a developer,
+ manager and executive in both small and large software companies. I
+ started my career working on commercial compilers, first BASIC and then
+ C. I have written a lot of code in many different languages, and
+ managed teams with even broader language usage.
+
+ I learned Go about 5 years ago when I was CTO at a startup/scaleup. At
+ the time, we were looking to move to a microservice architecture, and
+ that shift gave us the opportunity to consider moving away from the
+ incumbent language (Scala). As I read through the Go tutorials, my
+ compiler-writing background came back to me and I found myself
+ repeatedly thinking "That's cool — I know why the Go team did that!" So
+ I got hooked on the language design.
+
+Learning
+
+ I have worked with many different computer languages over the years, so
+ I was not surprised I could quickly get started writing Go programs
+ after reading through the online documents and tutorials. But then when
+ I saw a new co-op student (a.k.a. intern) learn Go and write a
+ substantial prototype in their first two weeks on the job, it became
+ clear that Go was much easier to learn than many other languages.
+
+Writing code
+
+ As I started writing my first Go programs, the first thing that struck
+ me was the blazing compiler speed. Starting my application was as fast
+ as or faster than with many interpreted languages, yet Go is a
+ compiled, strongly typed language. (I have an affinity for strongly
+ typed languages — I have spent way too much time tracking down obscure
+ issues in my own code in dynamically typed languages, where the same
+ issue would have been a compile error in a strongly typed language.)
+ Even better, in Go I often don't need to declare the type — the
+ compiler figures it out.
+
+ I was impressed with the standard Go library — it included many of the
+ capabilities required by modern applications — things like HTTP
+ support, JSON handling and encryption. Many other languages required
+ you to use a third-party library for these features, and often there
+ were multiple competing libraries to choose from, adding another
+ decision point for the developer. With Go, I could go to the standard
+ library GoDoc and get started right away.
+
+ There were a few other language decisions that I found helpful. One is
+ that the compiler figures out if you are returning a pointer to a
+ local variable, and behind the scenes allocates that memory on the
+ heap rather than on the stack. This prevents bugs, and I find the code
+ more readable.
+
+ I also like that you don't declare that you support an interface. I
+ wasn't sure I would like this at first, because it isn't obvious
+ whether a type implements a particular interface, but I found greater
+ value in the fact that I wasn't dependent on the code author (even if
+ it was me!) to declare that the interface is implemented. This first
+ hit home when I used fmt.Println() and it automatically used the
+ String() method I had implemented, even though it hadn't occurred to
+ me that I was implementing the Stringer interface.
+
+ The last feature I'll note is the ability to do concurrent programming
+ through channels and goroutines. The model is simple to understand yet
+ powerful.
+
+Reading code
+
+ After writing more Go code and starting to incorporate third-party
+ libraries, I had a realization that had never occurred to me before —
+ as a developer, I spend a lot of time reading code. In fact, I probably
+ spend more time reading code than writing it, once you start counting
+ code reviews, debugging, and evaluating third-party libraries.
+
+ What was different about reading Go code?
+ I would summarize it by saying "it
+ all looks the same." What do I mean by that? Go format ensures all the
+ braces are in the same spot; capitalized identifiers are exported;
+ there are no implicit conversions, even of internal types; and there is
+ no overloading of operators, functions or methods. That means that with
+ Go code, "what you see is what you get," with no hidden meaning. Of
+ course, this doesn't help me understand a complicated algorithm, but
+ it does mean that I can concentrate more on that algorithm, because I
+ don't have to work out whether "+" is overloaded, for example.
+
+ I was also pleasantly surprised when I used GoDoc on one of my
+ projects and discovered that I had semi-reasonable documentation
+ without doing anything while writing the code other than adding
+ comments on my functions and methods, based on nagging from the IDE I
+ was using. I did spend some time cleaning up the comments after that,
+ but I'm not sure I would have even started that work if Go hadn't given
+ me a great starting point.
+
+Testing code
+
+ Go test is part of the standard Go tools and supported by IDEs, making
+ it easy to get started creating unit tests for my code. And like the
+ standard Go library, having a standard way to do tests means I don't
+ have to evaluate external testing frameworks and select one. I can also
+ understand the tests when I'm evaluating a third-party library.
+
+ Even better, the default behavior when running package tests in VSCode
+ is to enable Go's built-in code coverage. I had never taken code
+ coverage seriously working in other languages, partly because it was
+ often difficult to set up. But the immediate feedback (helped by the
+ blazing compile speed) gamified this for me, and I found myself adding
+ tests to increase code coverage (and finding new bugs along the way).
+
+ Go doesn't allow circular dependencies between packages.
+ While this has
+ caused me some rethinking while writing code, I find it makes my
+ testing regimen easier to think about — if I depend on a package, I can
+ rely on that package to have its own tests covering its capabilities.
+
+Deploying code
+
+ I learned Go at the same time we were migrating towards container-based
+ microservices. In that environment, the fact that Go produces a single,
+ self-contained executable makes it much easier and more efficient to
+ build and manage containers. I can build a container layer with one
+ single file, which is often a single-digit number of MB in size,
+ compared to our prior JVM-based containers, which started with hundreds
+ of MB for the Java runtime plus another layer for our application. (It
+ is easy to forget how much this overhead ends up costing in production,
+ particularly if you have hundreds or thousands of containers running.)
+
+ Second, Go has built-in cross-compiling capabilities, so our
+ development machines, containers and cloud hardware don't all have to
+ be on the same processor or operating system. For example, I can use a
+ Linux build machine to produce client executables for Linux, Mac and
+ Windows. Again, this takes away a complicated decision process due to
+ artificial constraints.
+
+ Finally, Go has established a well-defined set of principles for
+ versioning and compatibility. While not all pieces of this are
+ enforced, having the principles from an authoritative source helps
+ manage the real-life challenges of keeping your software supply chain
+ up to date. For example, it is strongly recommended that breaking
+ changes require a new major version number. While not enforced, this
+ leads the community to call out any open source package that violates
+ the principle.
+
+What do I miss?
+
+ I did miss generics; thankfully, Go 1.18 added support. And I do wish
+ the standard library offered immutable collections (like Scala and
+ other functional languages).
+ Embedding instead of inheritance works
+ much the same in many cases, but sometimes requires some deep
+ thinking.
+
+ My most frequent coding mistake is using a value receiver for a method
+ when I should have used a pointer receiver, then modifying the
+ receiver and expecting the changes to be visible when the method
+ returns. The code looks correct, and the right values get assigned if
+ I step through with a debugger or add prints, but the changes
+ disappear after the method returns. I think I would have preferred
+ immutable receivers: that would have caught these errors at compile
+ time, and in the few remaining cases where I wanted to modify the
+ receiver I would have copied it to a local variable.
+
+In conclusion
+
+ As you can tell, I am a huge fan of Go, from even before I joined
+ Google. I am impressed by the language and ecosystem design, and by
+ the implementation. For me, Go makes me a more productive developer,
+ and I'm more confident in the quality of the code I produce.
+
+ Go, give it a [46]try!
+
+ Posted in
+ * [47]Application Modernization
+ * [48]Application Development
+ * [49]Open Source
Shift down +instead + + By Richard Seroter ⢠5-minute read + + https://storage.googleapis.com/gweb-cloudblog-publish/images/DO_NOT_USE + _ps1BuN1.max-700x700.jpg + DevOps & SRE + +Config Connector: An easy way to manage your infrastructure in Google Cloud + + By Leonid Yankulin ⢠4-minute read + + https://storage.googleapis.com/gweb-cloudblog-publish/images/DO_NOT_USE + _Wfx45fA.max-700x700.jpg + Application Modernization + +Realizing cloud value for a render platform at Wayfair â Part 2 + + By Jack Brooks ⢠4-minute read + + https://storage.googleapis.com/gweb-cloudblog-publish/images/DO_NOT_USE + _Wfx45fA.max-700x700.jpg + Application Modernization + +Realizing cloud value for a render platform at Wayfair - Part 1 + + By Jack Brooks ⢠4-minute read + + Footer Links + +Follow us + + * + * + * + * + * + + * [50]Google Cloud + * [51]Google Cloud Products + * [52]Privacy + * [53]Terms + + * [54]Help + * [âªEnglishâ¬_____....] + +References + + Visible links: + 1. https://cloud.google.com/?lfhs=2 + 2. https://cloud.google.com/blog/#content + 3. https://cloud.google.com/blog + 4. https://cloud.google.com/contact/ + 5. https://console.cloud.google.com/freetrial/ + 6. https://cloud.google.com/blog + 7. https://cloud.google.com/blog/products/ai-machine-learning + 8. https://cloud.google.com/blog/products/api-management + 9. https://cloud.google.com/blog/products/application-development + 10. https://cloud.google.com/blog/products/application-modernization + 11. https://cloud.google.com/blog/products/chrome-enterprise + 12. https://cloud.google.com/blog/products/compute + 13. https://cloud.google.com/blog/products/containers-kubernetes + 14. https://cloud.google.com/blog/products/data-analytics + 15. https://cloud.google.com/blog/products/databases + 16. https://cloud.google.com/blog/products/devops-sre + 17. https://cloud.google.com/blog/topics/maps-geospatial + 18. https://cloud.google.com/blog/products/identity-security + 19. 
https://cloud.google.com/blog/products/infrastructure + 20. https://cloud.google.com/blog/products/infrastructure-modernization + 21. https://cloud.google.com/blog/products/networking + 22. https://cloud.google.com/blog/products/productivity-collaboration + 23. https://cloud.google.com/blog/products/sap-google-cloud + 24. https://cloud.google.com/blog/products/storage-data-transfer + 25. https://cloud.google.com/blog/topics/sustainability + 26. https://cloud.google.com/transform + 27. https://cloud.google.com/blog/topics/financial-services + 28. https://cloud.google.com/blog/topics/healthcare-life-sciences + 29. https://cloud.google.com/blog/topics/manufacturing + 30. https://cloud.google.com/blog/products/media-entertainment + 31. https://cloud.google.com/blog/topics/public-sector + 32. https://cloud.google.com/blog/topics/retail + 33. https://cloud.google.com/blog/topics/supply-chain-logistics + 34. https://cloud.google.com/blog/topics/telecommunications + 35. https://cloud.google.com/blog/topics/partners + 36. https://cloud.google.com/blog/topics/startups + 37. https://cloud.google.com/blog/topics/training-certifications + 38. https://cloud.google.com/blog/topics/inside-google-cloud + 39. https://cloud.google.com/blog/topics/google-cloud-next + 40. https://cloud.google.com/blog/products/maps-platform + 41. https://workspace.google.com/blog + 42. https://cloud.google.com/blog/topics/developers-practitioners + 43. https://cloud.google.com/transform + 44. https://cloud.google.com/contact/ + 45. https://console.cloud.google.com/freetrial/ + 46. https://go.dev/tour/list + 47. https://cloud.google.com/blog/products/application-modernization + 48. https://cloud.google.com/blog/products/application-development + 49. https://cloud.google.com/blog/products/open-source + 50. https://cloud.google.com/ + 51. https://cloud.google.com/products/ + 52. https://myaccount.google.com/privacypolicy?hl=en-US + 53. https://myaccount.google.com/termsofservice?hl=en-US + 54. 
https://support.google.com/
diff --git a/static/archive/crawshaw-io-k5slfj.txt b/static/archive/crawshaw-io-k5slfj.txt
new file mode 100644
index 0000000..62ff2e9
--- /dev/null
+++ b/static/archive/crawshaw-io-k5slfj.txt
@@ -0,0 +1,574 @@
+ #[1]crawshaw.io atom feed
+
+One process programming notes (with Go and SQLite)
+
+ 2018 July 30
+
+ Blog-ified version of a talk I gave at [2]Go Northwest.
+
+ This content covers my recent exploration of writing internet services,
+ iOS apps, and macOS programs as an indie developer.
+
+ There are several topics here that should each have their own blog
+ post. But as I have a lot of programming to do, I am going to put these
+ notes up as is and split the material out some time later.
+
+ My focus has been on how to adapt the lessons I have learned working in
+ teams at Google to a single programmer building small-business
+ software. There are many great engineering practices in Silicon
+ Valleyʼs big companies and well-capitalized VC firms, but one person
+ does not have enough bandwidth to use them all and still write
+ software. The exercise for me is: what to keep and what must go.
+
+ If I have been doing it right, the technology and techniques described
+ here will sound easy. I have to fit it all in my head while having
+ enough capacity left over to write software people want. Every extra
+ thing has great cost, especially rarely touched software that comes
+ back to bite in the middle of the night six months later.
+
+ Two key technologies I have decided to use are Go and SQLite.
+
+A brief introduction to SQLite
+
+ SQLite is an implementation of SQL. Unlike traditional database
+ implementations like PostgreSQL or MySQL, SQLite is a self-contained C
+ library designed to be embedded into programs. It has been built by D.
+ Richard Hipp since its release in 2000, and in the past 18 years other
+ open source contributors have helped.
+ At this point it has been around
+ most of the time I have been programming and is a core part of my
+ programming toolbox.
+
+Hands-on with the SQLite command line tool
+
+ Rather than talk through SQLite in the abstract, let me show it to you.
+
+ A kind person on Kaggle has [3]provided a CSV file of the plays of
+ Shakespeare. Letʼs build an SQLite database out of it.
+$ head shakespeare_data.csv
+"Dataline","Play","PlayerLinenumber","ActSceneLine","Player","PlayerLine"
+"1","Henry IV",,,,"ACT I"
+"2","Henry IV",,,,"SCENE I. London. The palace."
+"3","Henry IV",,,,"Enter KING HENRY, LORD JOHN OF LANCASTER, the EARL of WESTMORELAND, SIR WALTER BLUNT, and others"
+"4","Henry IV","1","1.1.1","KING HENRY IV","So shaken as we are, so wan with care,"
+"5","Henry IV","1","1.1.2","KING HENRY IV","Find we a time for frighted peace to pant,"
+"6","Henry IV","1","1.1.3","KING HENRY IV","And breathe short-winded accents of new broils"
+"7","Henry IV","1","1.1.4","KING HENRY IV","To be commenced in strands afar remote."
+"8","Henry IV","1","1.1.5","KING HENRY IV","No more the thirsty entrance of this soil"
+"9","Henry IV","1","1.1.6","KING HENRY IV","Shall daub her lips with her own children's blood,"
+
+ First, letʼs use the sqlite command line tool to create a new database
+ and import the CSV.
+$ sqlite3 shakespeare.db
+sqlite> .mode csv
+sqlite> .import shakespeare_data.csv import
+
+ Done! A couple of SELECTs will let us quickly see if it worked.
+sqlite> SELECT count(*) FROM import;
+111396
+sqlite> SELECT * FROM import LIMIT 10;
+1,"Henry IV","","","","ACT I"
+2,"Henry IV","","","","SCENE I. London. The palace."
+3,"Henry IV","","","","Enter KING HENRY, LORD JOHN OF LANCASTER, the EARL of WESTMORELAND, SIR WALTER BLUNT, and others"
+4,"Henry IV",1,1.1.1,"KING HENRY IV","So shaken as we are, so wan with care,"
+5,"Henry IV",1,1.1.2,"KING HENRY IV","Find we a time for frighted peace to pant,"
+6,"Henry IV",1,1.1.3,"KING HENRY IV","And breathe short-winded accents of new broils"
+7,"Henry IV",1,1.1.4,"KING HENRY IV","To be commenced in strands afar remote."
+8,"Henry IV",1,1.1.5,"KING HENRY IV","No more the thirsty entrance of this soil"
+9,"Henry IV",1,1.1.6,"KING HENRY IV","Shall daub her lips with her own children's blood,"
+
+ Looks good! Now we can do a little cleanup. The original CSV contains a
+ column called ActSceneLine that uses dots to encode Act number, Scene
+ number, and Line number. Those would look much nicer as their own
+ columns.
+sqlite> CREATE TABLE plays (rowid INTEGER PRIMARY KEY, play, linenumber, act, scene, line, player, text);
+sqlite> .schema
+CREATE TABLE import (rowid primary key, play, playerlinenumber, actsceneline, player, playerline);
+CREATE TABLE plays (rowid INTEGER PRIMARY KEY, play, linenumber, act, scene, line, player, text);
+sqlite> INSERT INTO plays SELECT
+   rowid,
+   play,
+   playerlinenumber AS linenumber,
+   substr(actsceneline, 1, 1) AS act,
+   substr(actsceneline, 3, 1) AS scene,
+   substr(actsceneline, 5, 5) AS line,
+   player,
+   playerline AS text
+  FROM import;
+
+ (The substr above can be improved by using instr to find the ʼ.ʼ
+ characters. Exercise left for the reader.)
+
+ Here we used the INSERT ... SELECT syntax to build a table out of
+ another table. The ActSceneLine column was split apart using the
+ built-in SQLite function substr, which slices strings.
+
+ The result:
+sqlite> SELECT * FROM plays LIMIT 10;
+1,"Henry IV","","","","","","ACT I"
+2,"Henry IV","","","","","","SCENE I. London. The palace."
+3,"Henry IV","","","","","","Enter KING HENRY, LORD JOHN OF LANCASTER, the EARL of WESTMORELAND, SIR WALTER BLUNT, and others"
+4,"Henry IV",1,1,1,1,"KING HENRY IV","So shaken as we are, so wan with care,"
+5,"Henry IV",1,1,1,2,"KING HENRY IV","Find we a time for frighted peace to pant,"
+6,"Henry IV",1,1,1,3,"KING HENRY IV","And breathe short-winded accents of new broils"
+7,"Henry IV",1,1,1,4,"KING HENRY IV","To be commenced in strands afar remote."
+8,"Henry IV",1,1,1,5,"KING HENRY IV","No more the thirsty entrance of this soil"
+9,"Henry IV",1,1,1,6,"KING HENRY IV","Shall daub her lips with her own children's blood,"
+
+ Now we have our data, let us search for something:
+sqlite> SELECT * FROM plays WHERE text LIKE "whether tis nobler%";
+sqlite>
+
+ That did not work. Hamlet definitely says that, but perhaps the text
+ formatting is slightly off. SQLite to the rescue: it ships with a Full
+ Text Search extension compiled in. Let us index all of Shakespeare with
+ FTS5:
+sqlite> CREATE VIRTUAL TABLE playsearch USING fts5(playsrowid, text);
+sqlite> INSERT INTO playsearch SELECT rowid, text FROM plays;
+
+ Now we can search for our soliloquy:
+sqlite> SELECT rowid, text FROM playsearch WHERE text MATCH "whether tis nobler";
+34232|Whether 'tis nobler in the mind to suffer
+
+ Success! The act and scene can be acquired by joining with our original
+ table.
+sqlite> SELECT play, act, scene, line, player, plays.text
+   FROM playsearch
+   INNER JOIN plays ON playsearch.playsrowid = plays.rowid
+   WHERE playsearch.text MATCH "whether tis nobler";
+Hamlet|3|1|65|HAMLET|Whether 'tis nobler in the mind to suffer
+
+ Letʼs clean up.
+sqlite> DROP TABLE import;
+sqlite> VACUUM;
+
+ Finally, what does all of this look like on the file system?
+$ ls -l
+-rwxr-xr-x@ 1 crawshaw staff 10188854 Apr 27  2017 shakespeare_data.csv
+-rw-r--r--  1 crawshaw staff 22286336 Jul 25 22:05 shakespeare.db
+
+ There you have it.
+ The SQLite database contains two full copies of the
+ plays of Shakespeare, one with a full text search index, and stores
+ both of them in about twice the space it takes the original CSV file to
+ store one. Not bad.
+
+ That should give you a feel for the i-t-e of SQLite.
+
+ And scene.
+
+Using SQLite from Go
+
+The standard database/sql
+
+ There are a number of cgo-based [4]database/sql drivers available for
+ SQLite. The most popular one appears to be
+ [5]github.com/mattn/go-sqlite3. It gets the job done and is probably
+ what you want.
+
+ Using the database/sql package, it is straightforward to open an SQLite
+ database and execute SQL statements on it. For example, we can run the
+ FTS query from earlier using this Go code:
+package main
+
+import (
+	"database/sql"
+	"fmt"
+	"log"
+
+	_ "github.com/mattn/go-sqlite3"
+)
+
+func main() {
+	db, err := sql.Open("sqlite3", "shakespeare.db")
+	if err != nil {
+		log.Fatal(err)
+	}
+	defer db.Close()
+	stmt, err := db.Prepare(`
+		SELECT play, act, scene, plays.text
+		FROM playsearch
+		INNER JOIN plays ON playsearch.playsrowid = plays.rowid
+		WHERE playsearch.text MATCH ?;`)
+	if err != nil {
+		log.Fatal(err)
+	}
+	var play, text string
+	var act, scene int
+	err = stmt.QueryRow("whether tis nobler").Scan(&play, &act, &scene, &text)
+	if err != nil {
+		log.Fatal(err)
+	}
+	fmt.Printf("%s %d:%d: %q\n", play, act, scene, text)
+}
+
+ Executing it yields:
+Hamlet 3:1: "Whether 'tis nobler in the mind to suffer"
+
+A low-level wrapper: crawshaw.io/sqlite
+
+ Just as SQLite steps beyond the basics of SELECT, INSERT, UPDATE and
+ DELETE with full-text search, it has several other interesting features
+ and extensions that cannot be accessed by SQL statements alone. These
+ need specialized interfaces, and many of those interfaces are not
+ supported by any of the existing drivers.
+
+ So I wrote my own. You can get it from [6]crawshaw.io/sqlite.
+ In
+ particular, it supports the streaming blob interface, the [7]session
+ extension, and implements the necessary sqlite_unlock_notify machinery
+ to make good use of the [8]shared cache for connection pools. I am
+ going to cover these features through two use-case studies: the client
+ and the cloud.
+
+cgo
+
+ All of these approaches rely on cgo for integrating C into Go. This is
+ straightforward to do, but adds some operational complexity: building
+ a Go program that uses SQLite requires a C compiler for the target
+ platform. In practice, this means that if you develop on macOS, you
+ need to install a cross-compiler for Linux.
+
+ Typical concerns about the impact on software quality of adding C code
+ to Go do not apply to SQLite, as it has an extraordinary degree of
+ testing. The quality of the code is exceptional.
+
+Go and SQLite for the client
+
+ I am building an [9]iOS app, with almost all the code written in Go and
+ the UI provided by a web view. This app has a full copy of the user
+ data; it is not a thin view onto an internet server. This means storing
+ a large amount of local, structured data, on-device full text
+ searching, background tasks working on the database in a way that does
+ not disrupt the UI, and syncing DB changes to a backup in the cloud.
+
+ That is a lot of moving parts for a client. More than I want to write
+ in JavaScript, and more than I want to write in Swift and then have to
+ promptly rewrite if I ever manage to build an Android app. More
+ importantly, the server is in Go, and I am one independent developer.
+ It is absolutely vital that I reduce the number of moving pieces in my
+ development environment to the minimum. Hence the effort to build (the
+ big bits of) a client using the exact same technology as my server.
+
+The Session extension
+
+ The session extension lets you start a session on an SQLite connection.
+ All changes made to the database through that connection are bundled
+ into a patchset blob.
+ The extension also provides methods for applying
+ a generated patchset to another copy of the database.
+func (conn *Conn) CreateSession(db string) (*Session, error)
+
+func (s *Session) Changeset(w io.Writer) error
+
+func (conn *Conn) ChangesetApply(
+	r io.Reader,
+	filterFn func(tableName string) bool,
+	conflictFn func(ConflictType, ChangesetIter) ConflictAction,
+) error
+
+ This can be used to build a very simple client-sync system. Collect the
+ changes made in a client, periodically bundle them up into a changeset,
+ and upload it to the server, where it is applied to a backup copy of
+ the database. If another client changes the database, then the server
+ advertises it to the client, who downloads a changeset and applies it.
+
+ This requires a bit of care in the database design. The reason I kept
+ the FTS table separate in the Shakespeare example is that I keep my FTS
+ tables in a separate attached database (which, in SQLite, means a
+ different file). The cloud backup database never generates the FTS
+ tables; the client is free to generate them in a background thread, and
+ they can lag behind data backups.
+
+ Another point of care is minimizing conflicts. The biggest source is
+ AUTOINCREMENT keys. By default, the primary key of a rowid table is
+ incremented, which means that if you have multiple clients generating
+ rowids you will see lots of conflicts.
+
+ I have been trialing two different solutions. The first is having each
+ client register a rowid range with the server and only allocate from
+ its own range. It works. The second is randomly generating int64
+ values and relying on the low collision rate. So far it works too.
+ Both strategies have risks, and I havenʼt decided which is better.
+
+ In practice, I have found I have to limit DB updates to a single
+ connection to keep changeset quality high. (A changeset does not see
+ changes made on other connections.)
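As a back-of-the-envelope check on the random-int64 strategy above, the birthday bound puts the probability that any two of n uniformly random 63-bit keys collide at roughly n(n-1)/(2·2^63). A quick sketch (the numbers are mine, not from the post):

```go
package main

import (
	"fmt"
	"math"
)

// collisionProb is the birthday-bound estimate of the probability that
// any two of n uniformly random 63-bit keys collide: n(n-1) / (2 * 2^63).
func collisionProb(n float64) float64 {
	return n * (n - 1) / (2 * math.Pow(2, 63))
}

func main() {
	for _, n := range []float64{1e3, 1e6, 1e9} {
		fmt.Printf("n=%.0e  collision probability ~ %.3g\n", n, collisionProb(n))
	}
}
```

Even at a million rows the estimate is well under one in ten million, consistent with "so far it works too"; at a billion rows the risk becomes visible, which is one reason registered rowid ranges may age better.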
+ To keep all writes on that one connection, I maintain a read-only
+ pool of connections and a single guarded read-write connection in a
+ pool of one. The code only grabs the read-write connection when it
+ needs it, and the read-only connections are enforced by the read-only
+ bit on the SQLite connection.
+
+Nested Transactions
+
+ The database/sql package encourages the use of SQL transactions with
+ its Tx type, but this does not play well with nested transactions, a
+ concept implemented by SAVEPOINT / RELEASE in SQL that makes for
+ surprisingly composable code.
+
+ If a function needs to make multiple statements in a transaction, it
+ can open with a SAVEPOINT, then defer a call that issues a RELEASE if
+ the function returns no error, or a ROLLBACK (returning the error) if
+ it does.
+func f(conn *sqlite.Conn) (err error) {
+	conn...SAVEPOINT
+	defer func() {
+		if err == nil {
+			conn...RELEASE
+		} else {
+			conn...ROLLBACK
+		}
+	}()
+}
+
+ Now if this transactional function f needs to call another
+ transactional function g, then g can use exactly the same strategy and
+ f can call it in a very traditional Go way:
+if err := g(conn); err != nil {
+	return err // all changes in f will be rolled back by the defer
+}
+
+ The function g is also perfectly safe to use in its own right, as it
+ has its own transaction.
+
+ I have been using this SAVEPOINT-plus-deferred-RELEASE-or-ROLLBACK
+ pattern for several months now and find it invaluable. It makes it
+ easy to safely wrap code in SQL transactions.
+
+ The example above, however, is a bit bulky, and there are some edge
+ cases that need to be handled. (For example, if the RELEASE fails, then
+ an error needs to be returned.) So I have wrapped this up in a utility:
+func f(conn *sqlite.Conn) (err error) {
+	defer sqlitex.Save(conn)(&err)
+
+	// Code is transactional and can be stacked
+	// with other functions that call sqlitex.Save.
+}
+
+ The first time you see sqlitex.Save in action it can be a little
+ off-putting; at least it was for me when I first created it. But I
+ quickly got used to it, and it does a lot of heavy lifting. The first
+ call to sqlitex.Save opens a SAVEPOINT on the conn and returns a
+ closure that either RELEASEs or ROLLBACKs depending on the value of
+ err, and sets err if necessary.
+
+Go and SQLite in the cloud
+
+ I have spent several months now redesigning services I have encountered
+ before and designing services for problems I would like to work on
+ going forward. The process has led me to a general design that works
+ for many problems and that I quite enjoy building.
+
+ It can be summarized as 1 VM, 1 Zone, 1 process programming.
+
+ If this sounds ridiculously simplistic to you, I think thatʼs good! It
+ is simple. It does not meet all sorts of requirements that we would
+ like our modern fancy cloud services to meet. It is not "serverless",
+ which means that when a service is extremely small it does not run for
+ free, and when a service grows it does not automatically scale. Indeed,
+ there is an explicit scaling limit. Right now the best server you can
+ get from Amazon is roughly:
+ * 128 CPU threads at ~4GHz
+ * 4TB RAM
+ * 25 Gbit ethernet
+ * 10 Gbps NAS
+ * hours of yearly downtime
+
+ That is a huge potential downside of one process programming.
+ However, I claim that is a livable limit.
+
+ I claim typical services do not hit this scaling limit.
+
+ If you are building a small business, most products can grow and become
+ profitable well under this limit for years. When you see the limit
+ approaching in the next year or two, you have a business with revenue
+ to hire more than one engineer, and the new team can, in the face of
+ radically changing business requirements, rewrite the service.
Reaching this limit is a good problem to have, because when it comes you
+ will have plenty of time to deal with it and the human resources you
+ need to solve it well.
+
+ Early in the life of a small business you donʼt, and every hour you
+ spend trying to work beyond this scaling limit is an hour that would
+ have been better spent talking to your customers about their needs.
+
+ The principle at work here is:
+
+ Donʼt use N computers when 1 will do.
+
+ To go into a bit more technical detail:
+
+ I run a single VM on AWS, in a single availability zone. The VM has
+ three EBS volumes (EBS is Amazonʼs name for NAS). The first holds the
+ OS, logs, temporary files, and any ephemeral SQLite databases that are
+ generated from the main databases, e.g. FTS tables. The second holds
+ the primary SQLite database for the main service. The third holds the
+ customer sync SQLite databases.
+
+ The system is configured to periodically snapshot the system EBS volume
+ and the customer EBS volumes to S3, the Amazon geo-redundant blob
+ store. This is a relatively cheap operation that can be scripted,
+ because only blocks that change are copied.
+
+ The main EBS volume is backed up to S3 very regularly, by custom code
+ that flushes the WAL cache. Iʼll explain that in a bit.
+
+ The service is a single Go binary running on this VM. The machine has
+ plenty of extra RAM that is used by Linuxʼs disk cache. (And that can
+ be used by a second copy of the service spinning up for low-downtime
+ replacement.)
+
+ The result of this is a service that has at most tens of hours of
+ downtime a year, about as much chance of suffering block loss as a
+ physical computer with a RAID5 array, and active offsite backups being
+ made every few minutes to a distributed system that is built and
+ maintained by a large team.
+
+ This system is astonishingly simple. I shell into one machine. It is a
+ Linux machine. I have a deploy script for the service that is ten lines
+ long.
Almost all of my performance work is done with pprof.
+
+ On a medium-sized VM I can clock 5-6 thousand concurrent requests with
+ only a few hours of performance tuning. On the largest machine AWS has,
+ tens of thousands.
+
+ Now to talk a little more about the particulars of the stack:
+
+Shared cache and WAL
+
+ To make the server extremely concurrent there are two important SQLite
+ features I use. The first is the shared cache, which lets me allocate
+ one large pool of memory to the database page cache that many
+ concurrent connections can use simultaneously. This requires some
+ support in the driver for sqlite_unlock_notify so user code doesnʼt
+ need to deal with locking events, but that is transparent to end-user
+ code.
+
+ The second is the Write Ahead Log. This is a mode SQLite can be knocked
+ into at the beginning of a connection, which changes the way it writes
+ transactions to disk. Instead of locking the database and making
+ modifications along with a rollback journal, it appends each new change
+ to a separate file. This allows readers to work concurrently with the
+ writer. The WAL has to be flushed periodically by SQLite, which
+ involves locking the database and writing the changes from it. There
+ are default settings for doing this.
+
+ I override these and execute WAL flushes manually from a package that,
+ when it is done, also triggers an S3 snapshot. This package is called
+ reallyfsync, and if I can work out how to test it properly I will make
+ it open source.
+
+Incremental Blob API
+
+ Another smaller feature, but one important to my particular server, is
+ SQLiteʼs [10]incremental blob API. This allows a field of bytes to be
+ read and written in the DB without storing all the bytes in memory
+ simultaneously, which matters when each request may be working with
+ hundreds of megabytes but you want tens of thousands of potential
+ concurrent requests.
This is one of the places where the driver deviates from being a
+ close-to-cgo wrapper to being more [11]Go-like:
+type Blob
+ func (blob *Blob) Close() error
+ func (blob *Blob) Read(p []byte) (n int, err error)
+ func (blob *Blob) ReadAt(p []byte, off int64) (n int, err error)
+ func (blob *Blob) Seek(offset int64, whence int) (int64, error)
+ func (blob *Blob) Size() int64
+ func (blob *Blob) Write(p []byte) (n int, err error)
+ func (blob *Blob) WriteAt(p []byte, off int64) (n int, err error)
+
+ This looks a lot like a file, and indeed can be used like a file, with
+ one caveat: the size of a blob is set when it is created. (As such, I
+ still find temporary files to be useful.)
+
+Designing with one process programming
+
+ I start with: Do you really need N computers?
+
+ Some problems really do. For example, you cannot build a low-latency
+ index of the public internet with only 4TB of RAM. You need a lot more.
+ These problems are great fun, and we like to talk a lot about them, but
+ they are a relatively small fraction of all the code written. So far
+ all the projects I have been developing post-Google fit on 1 computer.
+
+ There are also more common sub-problems that are hard to solve with one
+ computer. If you have a global customer base and need low latency to
+ your server, the speed of light gets in the way. But many of these
+ problems can be solved with relatively straightforward CDN products.
+
+ Another great solution to the speed of light is geo-sharding. Have
+ complete and independent copies of your service in multiple
+ datacenters, and move each userʼs data to the service near them. This
+ can be as easy as having one small global redirect database (maybe
+ SQLite on geo-redundant NFS!) redirecting the user to a specific DNS
+ name like {us-east, us-west}.mservice.com.
+
+ Most problems do fit in one computer, up to a point. Spend some time
+ determining where that point is. If it is years away, there is a good
+ chance one computer will do.
+
+Indie dev techniques for the corporate programmer
+
+ Even if you do not write code in this particular technology stack and
+ you are not an independent developer, there is value here. Use the one
+ big VM, one zone, one process Go, SQLite, and snapshot-backup stack as
+ a hypothetical tool to test your designs.
+
+ So add a hypothetical step to your design process: If you solved your
+ problem on this stack with one computer, how far could you get? How
+ many customers could you support? At what size would you need to
+ rewrite your software?
+
+ If this indie mini stack would last your business for years, you might
+ want to consider delaying the adoption of modern cloud software.
+
+ If you are a programmer at a well-capitalized company, you may also
+ want to consider what development looks like for small internal or
+ experimental projects. Do your coworkers have to use large, complex
+ distributed systems for policy reasons? Many of these projects will
+ never need to scale beyond one computer, or if they do, they will need
+ a rewrite to deal with shifting requirements. In which case, find a way
+ to make an indie stack, Linux VMs with a file system, available for
+ prototyping and experimentation.
+ __________________________________________________________________
+
+ [12]Index
+ [13]github.com/crawshaw
+ [14]twitter.com/davidcrawshaw
+ david@zentus.com
+
+References
+
+ 1. file:///atom.xml
+ 2. https://gonorthwest.io/
+ 3. https://www.kaggle.com/kingburrito666/shakespeare-plays
+ 4. https://golang.org/pkg/database/sql
+ 5. https://github.com/mattn/go-sqlite3
+ 6. https://crawshaw.io/sqlite
+ 7. https://www.sqlite.org/sessionintro.html
+ 8. https://www.sqlite.org/sharedcache.html
+ 9. https://www.posticulous.com/
+ 10. https://www.sqlite.org/c3ref/blob_open.html
+ 11. https://godoc.org/crawshaw.io/sqlite#Blob
+ 12. file:///
+ 13. https://github.com/crawshaw
+ 14. https://twitter.com/davidcrawshaw