Flight School

Training a Text Classifier with Create ML and the Natural Language Framework

2018-06-13T00:00:00+00:00

Machine Learning can be difficult to get your head around as a programmer. But aside from all the advanced mathematics and tooling, perhaps most difficult of all is learning how to let go.

Many of us have spent years honing our craft of writing code: expressive, type-safe, unit-tested, refactored, clean, DRY code. So it’s tough to hear that pretty much anyone with enough data (and patience) can use machine learning to solve hard problems without understanding why or how it works.

But this isn’t news to you. In fact, ML is at the top of your list of things to learn next! You have Andrew Ng’s course bookmarked in Safari and a load of unread PDFs littering your Downloads folder. All you need is a free weekend and…

If that strikes a bit close to home, then you’ll love Create ML.

Create ML is a new framework that makes it easy to train machine learning models. How easy? Drag pictures of dogs into a “Dogs” folder and pictures of cats into a “Cats” folder, write a few lines of Swift code, wait a couple of minutes to train the model and boom: you have an image classifier that can tell you whether a picture contains a cat or a dog. You can even do this for text classification or regression for data tables.

If you haven’t already, go ahead and watch the Introducing Create ML session from this year’s WWDC. It’s like magic.

Pulling files from labeled directories makes for a nice demo, but what if your data set isn’t so nicely organized? This article shows how you can use Create ML to train a text classifier that predicts the programming language of unknown source code by manually creating the corpus from a heterogeneous data set.

You can find the complete training script and a demo playground here.

Acquiring a Corpus of Data

First things first: we need some examples of source code.

If you’re anything like the author, you might have thought to use GitHub search to find code by language. And if so, you’d eventually realize that the GitHub API (both the REST and GraphQL versions) requires code search results to be scoped by user or project. At that point, you’d probably wonder why you decided to build this yourself before looking for something off-the-shelf, like this project by Source Foundry.

Our corpus includes labeled code samples in C, C++, Go, Java, JavaScript, Objective-C, PHP, Python, Ruby, Rust, and Swift. Each directory in the project root corresponds to a language and contains flattened checkouts of a handful of popular open source repositories for that language:

        $ tree -L 2 code-corpora/swift
        code-corpora/swift
        ├── alamofire
        │   ├── Alamofire.h
        │   ├── Alamofire.swift
        │   ├── AppDelegate.swift
        │   ├── AuthenticationTests.swift
        # ...
        
        

Notice that Objective-C header in the Swift project, though. We can’t rely entirely on the top-level directories as labels for their contents, because most projects include other auxiliary scripts and source files (as well as README, LICENSE, and other repository miscellany).

Our training script uses the containing directory and file extension to determine the correct label for each file in our corpus. For example, .h files in the c directory are labeled as C, .h files in the cc directory are labeled as C++, and any file with the .go extension is labeled as Go:

        switch (directory, fileExtension) {
        case ("c", "h"), (_, "c"): label = "C"
        case ("cc", "h"), (_, "cc"), (_, "cpp"): label = "C++"
        case (_, "go"): label = "Go"
        // ...
        default:
            continue // Unknown, skip (assumes an enclosing loop over files)
        }
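
To make the matching rules easier to try out, here is a minimal sketch that wraps the same switch in a function. The name `label(directory:fileExtension:)` is hypothetical and not part of the training script, and only the three languages shown in the excerpt are covered; everything else returns nil so the caller can skip the file.

```swift
// Hypothetical helper: maps a (directory, extension) pair to a label.
// Only the three languages from the excerpt are handled here.
func label(directory: String, fileExtension: String) -> String? {
    switch (directory, fileExtension) {
    case ("c", "h"), (_, "c"):
        return "C"
    case ("cc", "h"), (_, "cc"), (_, "cpp"):
        return "C++"
    case (_, "go"):
        return "Go"
    default:
        return nil // unknown combination: the caller skips the file
    }
}

let samples = [("c", "h"), ("cc", "h"), ("swift", "h"), ("go", "go")]
for (directory, fileExtension) in samples {
    let result = label(directory: directory, fileExtension: fileExtension) ?? "skipped"
    print("\(directory)/*.\(fileExtension) -> \(result)")
}
```

Returning an optional keeps the "skip" decision with the caller, which fits the guard statement used when building the corpus.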
        
        

Training the Model

To build our data table, we recursively enumerate the contents of the corpus directory and append the contents of each source file that we can identify:

        import CreateML
        import Foundation
        var corpus: [(text: String, label: String)] = []
        let enumerator = FileManager.default.enumerator(
            at: corpusURL,
            includingPropertiesForKeys: [.isDirectoryKey]
        )!
        // Collect the text and label of every source file we can identify.
        for case let resource as URL in enumerator {
            guard !resource.hasDirectoryPath,
                  let language = ProgrammingLanguage(for: resource,
                                                     at: enumerator.level),
                  let text = try? String(contentsOf: resource)
            else {
                continue
            }
            corpus.append((text: text, label: language.rawValue))
        }
        // Split the rows into one array per column.
        let (texts, labels): ([String], [String]) =
            corpus.reduce(into: ([], [])) { (columns, row) in
                columns.0.append(row.text)
                columns.1.append(row.label)
            }
        let dataTable =
            try MLDataTable(dictionary: ["text": texts, "label": labels])
        
        

Our original implementation appended MLDataTable objects, instead of initializing a single data table from an accumulated array. We found this to have nonlinear performance characteristics, which caused training to take closer to an hour instead of a few minutes.
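
The accumulate-then-split step can be tried without Create ML. The sketch below runs the same `reduce(into:)` over a tiny, made-up corpus (the three rows are placeholders, not samples from the real data set) to show how the rows turn into one array per column.

```swift
// Placeholder rows standing in for the real corpus.
let corpus: [(text: String, label: String)] = [
    (text: "package main", label: "Go"),
    (text: "#include <stdio.h>", label: "C"),
    (text: "import Foundation", label: "Swift"),
]

// One pass over the rows, appending to both columns in place.
let (texts, labels): ([String], [String]) =
    corpus.reduce(into: ([], [])) { columns, row in
        columns.0.append(row.text)
        columns.1.append(row.label)
    }

print(texts)
print(labels)
```

Because `reduce(into:)` mutates a single accumulator, each row costs one append per column, and the data table is built only once at the end.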

With our data table in hand, we use the randomSplit(by:seed:) method to segment our training and testing data. The former is used immediately, passed into the MLTextClassifier initializer; the latter will be used next to evaluate the model.

        let (trainingData, testingData) =
            dataTable.randomSplit(by: 0.8, seed: 0)
        let classifier = try MLTextClassifier(trainingData: trainingData,
                                              textColumn: "text",
                                              labelColumn: "label")
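
To get a feel for what a seeded split does, here is a rough, framework-free sketch. It is not how Create ML implements randomSplit(by:seed:); the `SeededGenerator` type and the `randomSplit(_:by:seed:)` function below are hypothetical stand-ins that shuffle rows with a repeatable generator and cut the result at the given fraction.

```swift
// A small deterministic generator (SplitMix64) so the shuffle is repeatable.
struct SeededGenerator: RandomNumberGenerator {
    private var state: UInt64
    init(seed: UInt64) { state = seed }
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

// Hypothetical helper: shuffles rows with the seed, then cuts at `fraction`
// (assumed to be between 0 and 1).
func randomSplit<Row>(_ rows: [Row], by fraction: Double, seed: UInt64)
    -> (training: [Row], testing: [Row]) {
    var generator = SeededGenerator(seed: seed)
    let shuffled = rows.shuffled(using: &generator)
    let cut = Int((Double(rows.count) * fraction).rounded())
    return (Array(shuffled[..<cut]), Array(shuffled[cut...]))
}

let rows = Array(1...10)
let (training, testing) = randomSplit(rows, by: 0.8, seed: 0)
print(training.count, testing.count) // 8 2
```

Passing the same seed yields the same partition on every run, which is what makes an evaluation repeatable.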
        
        

Creating an MLTextClassifier object takes a while, but you can track the progress by tailing STDOUT:

        Automatically generating validation set from 5% of the data.
        Tokenizing data and extracting features
        10% complete
        20% complete
        30% complete
        40% complete
        50% complete
        60% complete
        70% complete
        80% complete
        90% complete
        100% complete
        Starting MaxEnt training with 8584 samples
        Iteration 1 training accuracy 0.285182
        Iteration 2 training accuracy 0.946295
        Iteration 3 training accuracy 0.988001
        Iteration 4 training accuracy 0.997554
        Iteration 5 training accuracy 0.998602
        Iteration 6 training accuracy 0.999185
        Iteration 7 training accuracy 0.999651
        Iteration 8 training accuracy 0.999767
        Finished MaxEnt training in 7.12 seconds
        
        

The resulting model seems large for what it can do, weighing in at 3 MB. However, it’s able to classify a file in ~20 ms, which should be fast enough for most use cases.

Evaluating the Model

Let’s see how our classifier performs by calling the evaluation(on:) method and passing the testingData that we segmented before.

        let evaluation = classifier.evaluation(on: testingData)
        print(evaluation)

Accuracy

At the top of our evaluation, we get a summary with the number of examples, the number of classes, and the accuracy:

Number of Examples	1138
Number of Classes	10
Accuracy	99.56%

99.56% accuracy. That’s good, right? Let’s dig into the numbers to get a better understanding of how this behaves.

When you print(_:) an MLClassifierMetrics object, it shows a summary of the overall accuracy as well as a confusion matrix and a precision / recall table.

Confusion Matrix

A confusion matrix is a tool for visualizing the accuracy of predictions. Each column shows the predicted classes, and each row shows the actual class:

	C	C++	Go	Java	JS	Obj-C	PHP	Ruby	Rust	Swift
C	122	0	0	0	0	0	0	0	0	0
C++	0	73	0	0	0	2	0	0	0	0
Go	1	0	333	0	0	0	0	0	0	0
Java	0	0	0	137	0	0	0	0	0	0
JS	0	0	0	0	55	0	0	0	0	0
Obj-C	0	0	0	0	0	97	0	0	0	0
PHP	0	0	0	0	0	0	95	0	0	0
Ruby	0	0	0	0	0	0	0	136	0	0
Rust	0	0	0	0	0	0	0	0	73	0
Swift	0	0	0	0	0	0	0	0	0	12

100% accuracy would have values along the diagonal line where the predicted and actual classes match, and zeroes everywhere else. However, our accuracy isn’t perfect, so we have a few stray figures. From the table, we can see that Go was mistaken for C once and C++ was incorrectly labeled as Objective-C twice.
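
As a rough illustration of how such a table is tallied, the sketch below counts hypothetical (actual, predicted) pairs into a nested dictionary and derives the accuracy from them. The pairs are invented for the example and are not taken from our evaluation.

```swift
// Hypothetical (actual, predicted) pairs; not the article's results.
let predictions: [(actual: String, predicted: String)] = [
    ("C", "C"), ("C", "C"), ("Go", "Go"), ("Go", "C"),
    ("C++", "C++"), ("C++", "Objective-C"), ("Objective-C", "Objective-C"),
]

// counts[actual][predicted] = number of examples (rows are actual classes).
var counts: [String: [String: Int]] = [:]
for (actual, predicted) in predictions {
    counts[actual, default: [:]][predicted, default: 0] += 1
}

// Accuracy is the diagonal (actual == predicted) divided by the total.
let correct = predictions.filter { $0.actual == $0.predicted }.count
let accuracy = Double(correct) / Double(predictions.count)

print(counts["Go"] ?? [:])
print(accuracy)
```

Every entry off the diagonal, such as `counts["Go"]["C"]`, is one misclassification.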

Precision and Recall

Another way of analyzing our results is in terms of precision and recall.

Class	Precision(%)	Recall(%)
C	99.19	100.00
C++	100.00	97.33
Go	100.00	98.94
Java	100.00	100.00
JavaScript	100.00	100.00
Objective-C	98.91	100.00
PHP	100.00	100.00
Ruby	100.00	100.00
Rust	100.00	100.00
Swift	100.00	100.00

Precision measures the ability of the model to identify only the relevant classifications within a data set. For example, our model had perfect precision for C++ because it never misidentified a source file as C++. However, it had imperfect precision for C because it incorrectly identified a Go file as C.

Recall measures the ability of a model to identify all of the relevant classifications within a data set. For example, our model had perfect recall for C because it correctly identified all of the C source code in the testing data, and imperfect recall for C++ because it missed two C++ files in the testing data.
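
Both measures fall straight out of a confusion matrix: precision divides the diagonal entry by its column sum, and recall divides it by its row sum. Here is a minimal sketch using a small, hypothetical three-class matrix (the counts are made up for illustration and are not our results).

```swift
// Rows are actual classes, columns are predicted classes (hypothetical counts).
let classes = ["C", "C++", "Go"]
let matrix = [
    [9, 0, 0],   // actual C
    [0, 8, 2],   // actual C++: two files mistaken for Go
    [1, 0, 10],  // actual Go: one file mistaken for C
]

// Precision: of everything predicted as this class, how much was right?
func precision(of index: Int) -> Double {
    let predictedAsClass = matrix.map { $0[index] }.reduce(0, +) // column sum
    return Double(matrix[index][index]) / Double(predictedAsClass)
}

// Recall: of everything that really is this class, how much was found?
func recall(of index: Int) -> Double {
    let actuallyClass = matrix[index].reduce(0, +) // row sum
    return Double(matrix[index][index]) / Double(actuallyClass)
}

for (index, name) in classes.enumerated() {
    print(name, precision(of: index), recall(of: index))
}
```

In this toy matrix, C has perfect recall but imperfect precision (a Go file landed in its column), while C++ has perfect precision but imperfect recall, mirroring the pattern described above. A class with no predictions at all would divide by zero, so a real implementation would need to handle that case.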

Writing the Model to Disk

So, we have our classifier, and we’ve evaluated it and found it to be satisfactory. The only thing left to do is write it to disk:

        let modelPath = URL(fileURLWithPath: destinationPath)
        let metadata = MLModelMetadata(
            author: "Mattt",
            shortDescription: "A model trained to classify programming languages",
            version: "1.0"
        )
        try classifier.write(to: modelPath, metadata: metadata)
        
        

All told, training, evaluating, and writing the model took less than 5 minutes:

        $ time swift ./Trainer.swift
        281.84 real       275.51 user         5.60 sys

Testing Out the Model in a Playground

In order to use our model from a Playground, we need to compile it first. For an iOS or Mac app, Xcode would automatically generate a programmatic interface for us. However, in a Playground, we need to do this ourselves.

Call the coremlc tool using the xcrun command, specifying the compile action on the .mlmodel file and targeting the current directory for the output:

        $ xcrun coremlc compile ProgrammingLanguageClassifier.mlmodel .

Take the resulting .mlmodelc bundle (it’ll look like a normal directory in Finder) and move it into the Resources folder of your playground. You can then use it to initialize an NLModel from the Natural Language framework:

        import NaturalLanguage
        // Load the compiled model from the playground's Resources folder.
        let url = Bundle.main.url(
            forResource: "ProgrammingLanguageClassifier",
            withExtension: "mlmodelc"
        )!
        let model = try! NLModel(contentsOf: url)
        
        

Now you can call the predictedLabel(for:) method to predict the programming language of a string containing code:

        let code = """
        struct Plane: Codable {
            var manufacturer: String
            var model: String
            var seats: Int
        }
        """
        model.predictedLabel(for: code) // Swift
        
        

The sample code project for this post wraps this up with a fun drag-and-drop UI, so you can easily test out your model with whatever source files you have littering your Desktop.

Conclusion

There’s no way this actually works… right? Source code isn’t like other kinds of text, and weighing keywords and punctuation equally with comments and variable names is obviously a flawed approach.

The way classifiers work, our model may well be fixating on irrelevant details like license comments in the file header. Heck, that 99% accuracy we saw could be more a reflection of file similarity within the same project than of the model’s actual predictive ability.

All of that said, it might just be good enough.

Consider this: in under an hour, we went from nothing to a working solution without any significant programming. That’s pretty incredible.

Create ML is a powerful way to prototype new features quickly. If a minimum-viable product is good enough, then your job is done. Or if you need to go even further, there are all kinds of optimizations to be had in terms of model size, accuracy, and precision by using something like TensorFlow or Turi Create.

Flight School for Chinese Language Speakers

2018-05-17T00:00:00+00:00

Flight School is partnering with 掘金 (Juéjīn), one of the leading developer communities in China, to publish our books for Chinese language speakers.

The Chinese language edition of Flight School Guide to Swift Codable is now available. The book was translated by Croath Liu (@croath), Flight School’s content lead for China.

Our mission is to provide thoughtful learning resources for developers around the world — and we can’t think of a better place to start than China.

As Tim Cook noted in a recent interview:

China has extraordinary skills. And the part that’s the most unknown is there’s almost 2 million application developers in China that write apps for the iOS App Store. These are some of the most innovative mobile apps in the world, and the entrepreneurs that run them are some of the most inspiring and entrepreneurial in the world.

Tim Cook

We couldn’t be more excited to connect with the millions of app developers in China.

To kick things off, Mattt and Croath will be attending the GMTC 2018 conference in Beijing on June 21st & 22nd. We invite you to reach out via Twitter (@flightdotschool) or email if you’d like to meet up!

此致
敬礼

Running Xcode Playgrounds on Travis CI

2018-05-16T00:00:00+00:00

Xcode Playgrounds are a great way to share sample code. They allow you to communicate ideas effectively without getting bogged down in implementation details. The question is: how do you ensure that things continue to work with each new version of Swift and platform SDKs?

This is something we’ve been thinking about since the release of our Guide to Swift Codable. We wanted to release the sample code Playgrounds as open source on GitHub, but not without some kind of testing strategy in place first (with over a dozen Playgrounds in total, doing this manually was out of the question).

As a baseline, our goal was to create a continuous integration setup that tests whether our code compiles and runs without any issues. (From there, we can progressively add more comprehensive tests for expected output and behavior.)

Each chapter directory contains one or more .playground bundles, each of which contains a Contents.swift file (this is what you first see when you open a Playground with Xcode) as well as any auxiliary sources.

            $ tree "Chapter 2"
            Chapter\ 2
            └── Flight\ Plan.playground
                ├── Contents.swift
                ├── Sources
                │   ├── Aircraft.swift
                │   ├── FlightPlan.swift
                │   └── FlightRules.swift
                └── contents.xcplayground
            
            

Compiling a Playground from Scratch

Without an Xcode project or Swift package manifest, we can’t directly hook into familiar solutions for testing apps or libraries. However, we can reasonably approximate how Xcode builds playgrounds by invoking swiftc directly (well, almost directly — we’ll call it via xcrun).

First, we cd into the .playground bundle.

            $ cd "Chapter 1/Plane.playground"
            
            

Next, we use swiftc to build an AuxiliarySources module from the Swift files in the Sources/ directory.

            $ xcrun swiftc -emit-library \
                  -emit-module -module-name AuxiliarySources \
                  Sources/*.swift
            
            

Some playgrounds depend on Playground-specific functionality in order to run. We can use the swiftc command’s -emit-imported-modules option to detect whether PlaygroundSupport is imported, and only attempt to build and run the Playground if it isn’t.

            $ if ! xcrun swiftc -emit-imported-modules Contents.swift |     \
                   grep -q "PlaygroundSupport";                             \
              then                                                          \
                ...                                                         \
              fi;
            
            

When running a Playground in Xcode, the shared sources module is imported automatically. Outside of Xcode, we can add the missing import statement by concatenating it with Contents.swift and then writing the output to a new main.swift file.

            cat <(echo "import AuxiliarySources")                     \
                Contents.swift > main.swift &&                        \
            xcrun -sdk "${SDK}"                                       \
                swiftc -target "${TARGET}" -emit-executable           \
                    -I "." -L "." -lAuxiliarySources                  \
                    -module-link-name AuxiliarySources                \
                    -o Playground main.swift
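
The concatenation step is simple enough to express in Swift as well. This is only a sketch of the same idea as the `cat` invocation above, not part of our CI setup; the `makeMainSource(contents:moduleName:)` helper and the sample playground source are hypothetical.

```swift
// Hypothetical helper mirroring `cat <(echo "import AuxiliarySources") Contents.swift`.
func makeMainSource(contents: String, moduleName: String = "AuxiliarySources") -> String {
    // Prepend the import so the shared sources module is visible to main.swift.
    return "import \(moduleName)\n" + contents
}

// Placeholder for the text of a playground's Contents.swift file.
let contents = "let plane = Plane(manufacturer: \"Cessna\")\nprint(plane)\n"
let mainSource = makeMainSource(contents: contents)
print(mainSource)
```

The resulting string is what would be written to main.swift before handing it to swiftc.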
            
            

Building Across All Playgrounds

We can get a list of all the .playground bundles by running the following command from the root directory:

            $ find . -name Chapter -prune -o -name '*.playground' -print | sort
            ./Chapter 1/Plane.playground
            ./Chapter 2/Flight Plan.playground
            ./Chapter 3/AnyDecodable.playground
            ./Chapter 3/Coordinates.playground
            ./Chapter 3/EconomySeat.playground
            ./Chapter 3/EitherBirdOrPlane.playground
            ./Chapter 3/FuelPrice.playground
            ./Chapter 3/Pixel.playground
            ./Chapter 3/Route.playground
            ./Chapter 4/Music Store.playground
            ./Chapter 5/In Flight Service.playground
            ./Chapter 6/Luggage Scanner.playground
            ./Chapter 7/MessagePackEncoder.playground
            
            

This list can be fed into the continuous integration system as environment variables, which are set on individual build jobs:

            $ export PLAYGROUND_DIR="Chapter 1/Plane.playground"
            $ cd "${PLAYGROUND_DIR}" && # compile playground

Putting it All Together

Now that we can compile Playground files locally and know how to repeat the process across our sample code, it’s time to write our .travis.yml file and kick off a test build:

.travis.yml

            language: swift
            osx_image: xcode9.3
            env:
              global:
                - SDK=iphoneos
                - TARGET=armv7-apple-ios10
              matrix:
                - PLAYGROUND_DIR="Chapter 1/Plane.playground"
                - PLAYGROUND_DIR="Chapter 2/Flight Plan.playground"
                - PLAYGROUND_DIR="Chapter 3/AnyDecodable.playground"
                - PLAYGROUND_DIR="Chapter 3/Coordinates.playground"
                - PLAYGROUND_DIR="Chapter 3/EconomySeat.playground"
                - PLAYGROUND_DIR="Chapter 3/EitherBirdOrPlane.playground"
                - PLAYGROUND_DIR="Chapter 3/FuelPrice.playground"
                - PLAYGROUND_DIR="Chapter 3/Pixel.playground"
                - PLAYGROUND_DIR="Chapter 3/Route.playground"
                - PLAYGROUND_DIR="Chapter 4/Music Store.playground"
                - PLAYGROUND_DIR="Chapter 5/In Flight Service.playground"
                - PLAYGROUND_DIR="Chapter 6/Luggage Scanner.playground"
                - PLAYGROUND_DIR="Chapter 7/MessagePackEncoder.playground"
            script: xcrun swift --version &&
              cd "${PLAYGROUND_DIR}" &&
              xcrun -sdk "${SDK}"
                swiftc -target "${TARGET}"
                  -emit-library -emit-module -module-name AuxiliarySources
                  Sources/*.swift &&
              if ! xcrun swiftc -emit-imported-modules Contents.swift |
                   grep -q "PlaygroundSupport";
              then
                cat <(echo "import AuxiliarySources") Contents.swift > main.swift &&
                xcrun -sdk "${SDK}"
                  swiftc -target "${TARGET}"
                    -I "." -L "." -lAuxiliarySources -module-link-name AuxiliarySources
                    -o Playground main.swift;
              fi
            
            

Global environment variables are used to declare constants for the SDK and target. Together, they allow us to quickly tell (and later change) what platform our builds are targeting.

Each Playground directory is specified as a separate environment variable, forming a Build Matrix with separate jobs for each one.
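
With more than a dozen entries, the matrix is tedious to maintain by hand. As a sketch of one way to generate it, assuming you already have a list of paths (for example, from the find command earlier), the hypothetical `matrixEntries(for:)` helper below renders one entry per .playground bundle.

```swift
// Hypothetical helper: renders one Travis CI matrix entry per playground path.
func matrixEntries(for paths: [String]) -> [String] {
    return paths
        .filter { $0.hasSuffix(".playground") } // ignore anything else
        .sorted()
        .map { "- PLAYGROUND_DIR=\"\($0)\"" }
}

let entries = matrixEntries(for: [
    "Chapter 2/Flight Plan.playground",
    "Chapter 1/Plane.playground",
    "README.md",
])
entries.forEach { print($0) }
```

The output lines can be pasted under the matrix key of the env section.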

Newlines and indentation help break up the long script command into more meaningful chunks. Everything works out in the end because YAML Plain Style helpfully strips out the extra whitespace when being parsed.

With CI all set up and ready to go, we’re thrilled to share the sample code for our first book.

Feel free to Clone, Star, and Fork to your heart’s content! And if you have any ideas for how to improve our setup even further, please open an issue or reach out via Twitter.

DIY Codable Encoder / Decoder Kit

2018-05-07T00:00:00+00:00

In Swift 4, a type that conforms to the Codable protocol can be encoded to or decoded from representations for any format that implements a corresponding Encoder or Decoder type.

At the time of its release, the only reference implementations for these types were the Foundation framework’s JSONEncoder / JSONDecoder and PropertyListEncoder / PropertyListDecoder. The implementation details of these types, however, are obfuscated by translation logic from JSONSerialization and PropertyListSerialization.
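
To see the format independence in practice, here is a small sketch that round-trips one type through both of Foundation's built-in formats. The Plane values are made up for the example, and the Equatable conformance is added only so the results can be compared.

```swift
import Foundation

struct Plane: Codable, Equatable {
    var manufacturer: String
    var model: String
    var seats: Int
}

// Illustrative values only.
let plane = Plane(manufacturer: "Cessna", model: "172 Skyhawk", seats: 4)

// The same type round-trips through two different formats
// without any format-specific code in Plane itself.
let json = try JSONEncoder().encode(plane)
let fromJSON = try JSONDecoder().decode(Plane.self, from: json)

let plist = try PropertyListEncoder().encode(plane)
let fromPlist = try PropertyListDecoder().decode(Plane.self, from: plist)

print(fromJSON == plane, fromPlist == plane)
```

A custom encoder and decoder pair would slot into the same two call sites.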

The DIY Codable Encoder / Decoder Kit repository on GitHub makes it easier for developers to create encoders and decoders for custom formats. The template includes stubbed placeholders for the required types and methods as well as simple tests for encoding and decoding Codable types.

Just do a find-and-replace for the format name and rename a few files, and you can get right into the nitty-gritty of your format’s specific implementation details.

Encoder Structure

              public class FormatEncoder {
                  public func encode<T>(_ value: T) throws -> Data
                      where T : Encodable
              }
              class _FormatEncoder: Encoder {
                  class SingleValueContainer: SingleValueEncodingContainer
                  class UnkeyedContainer: UnkeyedEncodingContainer
                  class KeyedContainer<Key>: KeyedEncodingContainerProtocol
                      where Key: CodingKey
              }
              protocol FormatEncodingContainer: class {}
              
              

Decoder Structure

              public class FormatDecoder {
                  public func decode<T>(_ type: T.Type,
                                        from data: Data) throws -> T
                      where T : Decodable
              }
              final class _FormatDecoder: Decoder {
                  class SingleValueContainer: SingleValueDecodingContainer
                  class UnkeyedContainer: UnkeyedDecodingContainer
                  class KeyedContainer<Key>: KeyedDecodingContainerProtocol
                      where Key: CodingKey
              }
              protocol FormatDecodingContainer: class {}
              
              

For an example of this template in action, see this Codable-compatible encoder and decoder for the MessagePack format.

We’d love to see what you make with this! Please get in touch via Twitter to share your custom encoder or decoder.

Benchmarking Codable

2018-05-01T00:00:00+00:00

Swift Codable can automatically synthesize initializers that decode models from JSON. But how does this generated code compare to what it replaces?

To find out, let’s benchmark the performance of JSONDecoder against equivalent hand-written code that uses JSONSerialization instead.

Defining a Sample Model

In order to establish a baseline, we’ll create a model that reasonably approximates something that you’d expect to find in a typical app.

For this example, we’ll use the following Airport model:

                struct Airport: Codable {
                    let name: String
                    let iata: String
                    let icao: String
                    let coordinates: [Double]
                    struct Runway: Codable {
                        enum Surface: String, Codable {
                            case rigid, flexible, gravel, sealed, unpaved, other
                        }
                        let direction: String
                        let distance: Int
                        let surface: Surface
                    }
                    let runways: [Runway]
                }
                
                

The Airport structure has String properties for its name, along with three-letter IATA and four-letter ICAO airport codes. The Runway type specifies a direction and distance, as well as a surface defined by a nested Surface enumeration. A pair of coordinates is stored in a [Double] array, though we could also define a custom type for that.

By conforming to Codable in its declaration and having stored properties with types that conform to Codable, Swift automatically synthesizes the implementation for the init(from:) initializer required by the Decodable protocol (the same goes for the Encodable protocol and its encode(to:) method).
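
Here is what that synthesized initializer buys us, as a minimal sketch. It assumes a JSON payload whose keys match the property names exactly, and it repeats the model definition so that it can run on its own.

```swift
import Foundation

struct Airport: Codable {
    let name: String
    let iata: String
    let icao: String
    let coordinates: [Double]

    struct Runway: Codable {
        enum Surface: String, Codable {
            case rigid, flexible, gravel, sealed, unpaved, other
        }
        let direction: String
        let distance: Int
        let surface: Surface
    }
    let runways: [Runway]
}

// A single record whose keys match the property names above.
let json = """
{
    "name": "Portland International Airport",
    "iata": "PDX",
    "icao": "KPDX",
    "coordinates": [-122.5975, 45.5886111111111],
    "runways": [
        { "distance": 1829, "direction": "3/21", "surface": "flexible" }
    ]
}
""".data(using: .utf8)!

// No hand-written init(from:) is needed; the compiler synthesized it.
let airport = try JSONDecoder().decode(Airport.self, from: json)
print(airport.name, airport.runways.count)
```

Nested types like Runway and Surface are decoded by the same mechanism, so the whole object graph comes from a single decode call.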

Airport may not check all the boxes in terms of Codable functionality, but it has sufficient complexity to extrapolate from our findings. And for whatever deficiencies this model has, it more than makes up for it by having real-world data to test with. As the saying goes, “Quantity has a quality all its own”.

By scraping Wikipedia’s “List of Airports” article, we were able to create a list of 7361 airports, from which we generated JSON files with 1, 10, 100, 1000, and 10000 objects (some records were duplicated to fill in the gaps for that last one).

                {
                    "name": "Portland International Airport",
                    "iata": "PDX",
                    "icao": "KPDX",
                    "coordinates": [-122.5975, 45.5886111111111],
                    "runways": [
                        {
                            "distance": 1829,
                            "direction": "3/21",
                            "surface": "flexible"
                        }
                        // ...
                    ]
                }
                
                

Taking a look at the relative sizes of these data sets:

Count	Size	gzip Compressed Size
1	271 Bytes	193 Bytes
10	2.8 KB	703 Bytes
100	33.0 KB	4.7 KB
1000	328.0 KB	44.3 KB
10000	3.2 MB	477.4 KB

Most apps don’t process more than tens of thousands of records at once, so our benchmark should be fairly representative as far as sample sizes go.

Manually Implementing a JSON Initializer

The conventional way to decode models from JSON without Codable is to implement an initializer that takes a [String: Any] type. A hand-rolled implementation of this approach for Airport weighs in at ~30 lines of code (maybe 5 to 10 minutes to write from scratch):

                extension Airport {
                    public init(json: [String: Any]) {
                        guard let name = json["name"] as? String,
                              let iata = json["iata"] as? String,
                              let icao = json["icao"] as? String,
                              let coordinates = json["coordinates"] as? [Double],
                              let runways = json["runways"] as? [[String: Any]]
                        else {
                            fatalError("Cannot initialize Airport from JSON")
                        }
                        self.name = name
                        self.iata = iata
                        self.icao = icao
                        self.coordinates = coordinates
                        self.runways = runways.map { Runway(json: $0) }
                    }
                }
                extension Airport.Runway {
                    public init(json: [String: Any]) {
                        guard let direction = json["direction"] as? String,
                              let distance = json["distance"] as? Int,
                              let surfaceRawValue = json["surface"] as? String,
                              let surface = Surface(rawValue: surfaceRawValue)
                        else {
                            fatalError("Cannot initialize Runway from JSON")
                        }
                        self.direction = direction
                        self.distance = distance
                        self.surface = surface
                    }
                }
                
                

With effective use of guard statements and the map(_:) method, this isn’t a particularly unappetizing chunk of boilerplate. But it’s hard to compete with the zero additional lines of code required of Codable.

Creating Performance Tests

We use Xcode’s built-in testing framework, XCTest, to measure the performance of each implementation.

In the setup for both tests (not shown here), a count is specified and the corresponding data set is loaded. Each test then decodes that data within a closure passed to the measure(_:) method:

                import XCTest
                class PerformanceTests: XCTestCase {
                    // Assigned during setup (not shown), hence implicitly unwrapped.
                    var data: Data!
                    var count: Int!
                    func testPerformanceCodable() {
                        self.measure {
                            let decoder = JSONDecoder()
                            let airports = try! decoder.decode([Airport].self, from: self.data)
                            XCTAssertEqual(airports.count, self.count)
                        }
                    }
                    func testPerformanceJSONSerialization() {
                        self.measure {
                            let json = try! JSONSerialization.jsonObject(with: self.data, options: []) as! [[String: Any]]
                            let airports = json.map { Airport(json: $0) }
                            XCTAssertEqual(airports.count, self.count)
                        }
                    }
                }
                
                

Benchmarking Execution Time

You can download the Xcode project used to produce these results on GitHub.

Because JSONDecoder uses JSONSerialization under the hood, we should expect the performance characteristics to be similar. And indeed that’s what we see here:

Wall Clock Time (Smaller is Better)

Count	JSONSerialization	Codable	Δ
1	0.5 ms	0.8 ms	+0.3 ms
10	1 ms	4 ms	+3 ms
100	3 ms	8 ms	+5 ms
1000	30 ms	51 ms	+21 ms
10000	382 ms	603 ms	+221 ms

Swift 4.1, Xcode 9.3 (9E145), iPhone X Simulator
2017 MacBook Pro, 2.9 GHz Intel Core i7, 16 GB 2133 MHz LPDDR3

On average, Codable with JSONDecoder is about half as fast as the equivalent implementation with JSONSerialization.
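
For the curious, the per-row arithmetic behind that figure can be sketched as follows. The timings are copied from the table above; the ratios range from roughly 1.6x to 4x, averaging about 2.3x.

```swift
// (count, JSONSerialization ms, Codable ms), copied from the table above.
let timings: [(count: Int, serialization: Double, codable: Double)] = [
    (1, 0.5, 0.8),
    (10, 1, 4),
    (100, 3, 8),
    (1000, 30, 51),
    (10000, 382, 603),
]

// How many times longer Codable takes for each data set.
let ratios = timings.map { $0.codable / $0.serialization }
let average = ratios.reduce(0, +) / Double(ratios.count)

for (timing, ratio) in zip(timings, ratios) {
    print(timing.count, ratio)
}
print(average)
```

Note that the ratio shrinks for the two largest data sets, which suggests that fixed per-call overhead accounts for part of the gap at small sizes.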

But does this mean that we shouldn’t use Codable? Probably not.

A 2x speedup factor may seem significant, but measured in absolute time difference, the savings are unlikely to be appreciable under most circumstances — and besides, performance is only one consideration in making a successful app.

If you have a codebase that uses JSONSerialization — whether directly or through a third-party framework — you might add benchmarks to see how Codable performs against your existing implementations. If performance is acceptable, you could then proceed to build new functionality with Codable before eventually transitioning existing code over.

Ultimately, every project is different, and it’s up to you to determine what’s right for you.

Codable isn’t a silver bullet, but it’s good enough that we should consider it to be our new default. Unless you have a specific reason to use JSONSerialization, Codable is an excellent choice for working with data representations.
