[2]Ollama
Written by [3]Mattt on February 14th, 2025
“Only Apple can do this”
Variously attributed to Tim Cook
Apple introduced [4]Apple Intelligence at WWDC 2024. After waiting almost a
year for Apple to, in Craig Federighi's words, “get it right”, its promise of
“AI for the rest of us” feels just as distant as ever.
Can we take a moment to appreciate the name? Apple Intelligence. AI. That's
some S-tier semantic appropriation. On the level of jumping on “podcast” before
anyone knew what else to call that.
While we wait for Apple Intelligence to arrive on our devices, something
remarkable is already running on our Macs. Think of it as a locavore approach
to artificial intelligence: homegrown, sustainable, and available year-round.
This week on NSHipster, we'll look at how you can use Ollama to run LLMs
locally on your Mac — both as an end-user and as a developer.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
[5]What is Ollama?
Ollama is the easiest way to run large language models on your Mac. You can
think of it as “Docker for LLMs”: a way to pull, run, and manage AI models as
easily as containers.
Download Ollama with [6]Homebrew or directly from [7]their website. Then pull
and run [8]llama3.2 (2GB).
$ brew install --cask ollama
$ ollama run llama3.2
>>> Tell me a joke about Swift programming.
What's a Apple developer's favorite drink?
The Kool-Aid.
Under the hood, Ollama is powered by [9]llama.cpp. But where llama.cpp provides
the engine, Ollama gives you a vehicle you'd actually want to drive — handling
all the complexity of model management, optimization, and inference.
Similar to how Dockerfiles define container images, Ollama uses Modelfiles to
configure model behavior:
FROM mistral:latest
PARAMETER temperature 0.7
TEMPLATE """
You are a helpful assistant.

User: {{ .Prompt }}
Assistant: """
Ollama uses the [10]Open Container Initiative (OCI) standard to distribute
models. Each model is split into layers and described by a manifest, the same
approach used by Docker containers:
{
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.ollama.image.config.v1+json",
    "digest": "sha256:..."
  },
  "layers": [
    {
      "mediaType": "application/vnd.ollama.image.layer.v1+json",
      "digest": "sha256:...",
      "size": 4019248935
    }
  ]
}
Overall, Ollama's approach is thoughtful and well-engineered. And best of all,
it just works.
[11]What's the big deal about running models locally?
[12]Jevons paradox states that, as something becomes more efficient, we tend to
use more of it, not less.
Having AI on your own device changes everything. When computation becomes
essentially free, you start to see intelligence differently.
While frontier models like GPT-4 and Claude are undeniably miraculous, there's
something to be said for the small miracle of running open models locally.
• Privacy: Your data never leaves your device. Essential for working with
sensitive information.
• Cost: Run 24/7 without usage meters ticking. No more rationing prompts like
90s cell phone minutes. Just a fixed, up-front cost for unlimited
inference.
• Latency: No network round-trips means faster responses. Your /M\d Mac(Book
  ( Pro| Air)?| mini| Studio)/ can easily generate dozens of tokens per
  second. (Try to keep up!)
• Control: No black-box [13]RLHF or censorship. The AI works for you, not the
other way around.
• Reliability: No outages or API quota limits. 100% uptime for your [14]
exocortex. Like having Wikipedia on a thumb drive.
[15]Building macOS Apps with Ollama
Ollama also exposes an [16]HTTP API on port 11434 ([17]leetspeak for llama 🦙).
This makes it easy to integrate with any programming language or tool.
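For instance, with the Ollama server running locally, you can talk to the HTTP API directly from any HTTP client. Here's a sketch of a request body for the generate endpoint (field names follow the [16]Ollama API docs; setting "stream" to false returns a single JSON object, whose response field contains the generated text, rather than a stream of chunks):

```json
{
  "model": "llama3.2",
  "prompt": "Tell me a joke about Swift programming.",
  "stream": false
}
```

POST this to http://localhost:11434/api/generate and parse the JSON that comes back.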
To that end, we've created the [18]Ollama Swift package to help developers
integrate Ollama into their apps.
[19]Text Completions
The simplest way to use a language model is to generate text from a prompt:
import Ollama
let client = Client.default
let response = try await client.generate(
model: "llama3.2",
prompt: "Tell me a joke about Swift programming.",
options: ["temperature": 0.7]
)
print(response.response)
// How many Apple engineers does it take to document an API?
// None - that's what WWDC videos are for.
[20]Chat Completions
For more structured interactions, you can use the chat API to maintain a
conversation with multiple messages and different roles:
let initialResponse = try await client.chat(
model: "llama3.2",
messages: [
.system("You are a helpful assistant."),
.user("What city is Apple located in?")
]
)
print(initialResponse.message.content)
// Apple's headquarters, known as the Apple Park campus, is located in Cupertino, California.
// The company was originally founded in Los Altos, California, and later moved to Cupertino in 1997.
let followUp = try await client.chat(
model: "llama3.2",
messages: [
.system("You are a helpful assistant."),
.user("What city is Apple located in?"),
.assistant(initialResponse.message.content),
.user("Please summarize in a single word")
]
)
print(followUp.message.content)
// Cupertino
[21]Generating text embeddings
[22]Embeddings convert text into high-dimensional vectors that capture semantic
meaning. These vectors can be used to find similar content or perform semantic
search.
For example, if you wanted to find documents similar to a user's query:
let documents: [String] = …
// Convert text into vectors we can compare for similarity
let embeddings = try await client.embeddings(
model: "nomic-embed-text",
texts: documents
)
/// Finds documents relevant to a query, ranked by similarity
func findRelevantDocuments(
    for query: String,
    threshold: Float = 0.7, // cutoff for matching, tunable
    limit: Int = 5
) async throws -> [String] {
    // Get an embedding for the query,
    // using the same model that embedded the documents
    let queryEmbeddings = try await client.embeddings(
        model: "nomic-embed-text",
        texts: [query]
    )
    let queryEmbedding = queryEmbeddings[0]

    // See: https://en.wikipedia.org/wiki/Cosine_similarity
    func cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float {
        let dotProduct = zip(a, b).map(*).reduce(0, +)
        func magnitude(_ v: [Float]) -> Float {
            sqrt(v.map { $0 * $0 }.reduce(0, +))
        }
        return dotProduct / (magnitude(a) * magnitude(b))
    }

    // Find documents above the similarity threshold
    let rankedDocuments = zip(embeddings, documents)
        .map { embedding, document in
            (similarity: cosineSimilarity(embedding, queryEmbedding),
             document: document)
        }
        .filter { $0.similarity >= threshold }
        .sorted { $0.similarity > $1.similarity }
        .prefix(limit)

    return rankedDocuments.map(\.document)
}
For simple use cases, you can also use Apple's [23]Natural Language framework
for text embeddings. They're fast and don't require additional dependencies.
import NaturalLanguage
let embedding = NLEmbedding.wordEmbedding(for: .english)
let vector = embedding?.vector(for: "swift")
[24]Building a RAG System
Embeddings really shine when combined with text generation in a RAG (Retrieval
Augmented Generation) workflow. Instead of asking the model to generate
information from its training data, we can ground its responses in our own
documents by:
1. Converting documents into embeddings
2. Finding relevant documents based on the query
3. Using those documents as context for generation
Here's a simple example:
let query = "What were AAPL's earnings in Q3 2024?"
let relevantDocs = try await findRelevantDocuments(for: query)
let context = """
Use the following documents to answer the question.
If the answer isn't contained in the documents, say so.
Documents:
\(relevantDocs.joined(separator: "\n---\n"))
Question: \(query)
"""
let response = try await client.generate(
model: "llama3.2",
prompt: context
)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
To summarize: Different models have different capabilities.
• Models like [25]llama3.2 and [26]deepseek-r1 generate text.
□ Some text models have “base” or “instruct” variants, suitable for
fine-tuning or chat completion, respectively.
□ Some text models are tuned to support [27]tool use, which lets them
perform more complex tasks and interact with the outside world.
• Models like [28]llama3.2-vision can take images along with text as inputs.
• Models like [29]nomic-embed-text create numerical vectors that capture
semantic meaning.
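Tool use works by declaring functions the model is allowed to call alongside a chat request. Here's a sketch of a single entry in the chat API's tools array (the shape follows the Ollama API docs; the get_weather function and its parameters are hypothetical):

```json
{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get the current weather for a given city",
    "parameters": {
      "type": "object",
      "properties": {
        "city": { "type": "string" }
      },
      "required": ["city"]
    }
  }
}
```

When the model decides to call a tool, its reply carries a tool_calls field instead of text; your app runs the function and sends the result back as a follow-up message so the model can finish its answer.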
With Ollama, you get unlimited access to a wealth of these and many more
open-source language models.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
So, what can you build with all of this?
Heres just one example:
[30]Nominate.app
[31]Nominate is a macOS app that uses Ollama to intelligently rename PDF files
based on their contents.
Like many of us striving for a paperless lifestyle, you might find yourself
scanning documents only to end up with cryptically-named PDFs like
Scan2025-02-03_123456.pdf. Nominate solves this by combining AI with
traditional NLP techniques to automatically generate descriptive filenames
based on document contents.
The app leverages several technologies we've discussed:
• Ollama's API for content analysis via the ollama-swift package
• Apple's PDFKit for OCR
• The Natural Language framework for text processing
• Foundation's DateFormatter for parsing dates
Nominate performs all processing locally. Your documents never leave your
computer. This is a key advantage of running models locally versus using cloud
APIs.
[32]Looking Ahead
“The future is already here; it's just not evenly distributed yet.”
William Gibson
Think about the timelines:
• Apple Intelligence was announced last year.
• Swift came out 10 years ago.
• SwiftUI 6 years ago.
If you wait for Apple to deliver on its promises, you're going to miss out on
the most important technological shift in a generation.
The future is here today. You don't have to wait. With Ollama, you can start
building the next generation of AI-powered apps right now.
NSMutableHipster
Questions? Corrections? [33]Issues and [34]pull requests are always welcome.
This article uses Swift version 6.0. Find status information for all articles
on the [35]status page.
Written by Mattt
[36]Mattt
[37]Mattt ([38]@mattt) is a writer and developer in Portland, Oregon.
🅭 🅯 🄏 NSHipster.com is released under a [39]Creative Commons BY-NC License.
References:
[1] https://nshipster.com/
[2] https://nshipster.com/ollama/
[3] https://nshipster.com/authors/mattt/
[4] https://www.apple.com/apple-intelligence/
[5] https://nshipster.com/ollama/#what-is-ollama
[6] https://brew.sh/
[7] https://ollama.com/download
[8] https://ollama.com/library/llama3.2
[9] https://github.com/ggerganov/llama.cpp
[10] https://opencontainers.org/
[11] https://nshipster.com/ollama/#whats-the-big-deal-about-running-models-locally
[12] https://en.wikipedia.org/wiki/Jevons_paradox
[13] https://knowyourmeme.com/photos/2546581-shoggoth-with-smiley-face-artificial-intelligence
[14] https://en.wiktionary.org/wiki/exocortex
[15] https://nshipster.com/ollama/#building-macos-apps-with-ollama
[16] https://github.com/ollama/ollama/blob/main/docs/api.md
[17] https://en.wikipedia.org/wiki/Leet
[18] https://github.com/mattt/ollama-swift
[19] https://nshipster.com/ollama/#text-completions
[20] https://nshipster.com/ollama/#chat-completions
[21] https://nshipster.com/ollama/#generating-text-embeddings
[22] https://en.wikipedia.org/wiki/Word_embedding
[23] https://developer.apple.com/documentation/naturallanguage/
[24] https://nshipster.com/ollama/#building-a-rag-system
[25] https://ollama.com/library/llama3.2
[26] https://ollama.com/library/deepseek-r1
[27] https://ollama.com/blog/tool-support
[28] https://ollama.com/library/llama3.2-vision
[29] https://ollama.com/library/nomic-embed-text
[30] https://nshipster.com/ollama/#nominateapp
[31] https://github.com/nshipster/nominate
[32] https://nshipster.com/ollama/#looking-ahead
[33] https://github.com/NSHipster/articles/issues
[34] https://github.com/NSHipster/articles/blob/master/2025-02-14-ollama.md
[35] https://nshipster.com/status/
[36] https://nshipster.com/authors/mattt/
[37] https://github.com/mattt
[38] https://twitter.com/mattt
[39] https://creativecommons.org/licenses/by-nc/4.0/