Tinker AI

Outcome

MVP shipped to TestFlight on time; designer-engineer collaboration worked better than expected with Windsurf as shared context


A small client project: a habit-tracking app, iOS and Android. Team of three: two designers (one product, one visual) and me as the only engineer. Six weeks from kickoff to TestFlight beta.

I picked Windsurf for the project specifically because Cascade’s multi-step task handling looked like a good fit for the kind of work I’d be doing — implementing screens designers had specced, with a lot of visual polish and animation work.

The result was uneven but mostly positive. Here’s the honest breakdown.

The setup

Stack:

  • React Native 0.74 with the new architecture (Fabric + TurboModules) enabled
  • Expo (managed workflow) for the dev experience
  • React Native Reanimated 3 for animations
  • TanStack Query for data fetching
  • Supabase for backend (auth + Postgres + storage)
  • Expo Router for navigation
  • Tamagui for theming and the design system

Why this stack: I know it reasonably well, which mattered because I was the only engineer. No room to learn an unfamiliar stack mid-project.

Week 1: setup and architecture

Windsurf was useful here for the boilerplate parts. Cascade scaffolded:

  • The Expo Router structure (tabs + modal routes; sketched after this list)
  • The TanStack Query provider setup
  • Supabase client config with environment variable handling
  • Authentication flow scaffolding (sign-in, sign-up, password reset screens)
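
The router scaffold was close to the standard Expo Router layout. A simplified sketch (route names here are illustrative, not the project's actual ones):

```tsx
// app/_layout.tsx: root stack wrapping the tab navigator, with modal routes on top
import { Stack } from 'expo-router';

export default function RootLayout() {
  return (
    <Stack>
      {/* the (tabs) group holds the main tab navigator */}
      <Stack.Screen name="(tabs)" options={{ headerShown: false }} />
      {/* modal routes sit alongside the tabs, presented as sheets */}
      <Stack.Screen name="habit/new" options={{ presentation: 'modal' }} />
    </Stack>
  );
}
```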

About 8 hours of work compressed to ~3 hours. The auth flow specifically was a big win — the boilerplate around React Native + Supabase auth is annoying, and Cascade handled it well.
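
The Supabase client config followed the documented Expo pattern. Roughly this (a sketch; the env var names are mine):

```ts
// lib/supabase.ts
import 'react-native-url-polyfill/auto';
import AsyncStorage from '@react-native-async-storage/async-storage';
import { createClient } from '@supabase/supabase-js';

// Expo inlines EXPO_PUBLIC_* variables into the client bundle at build time
const supabaseUrl = process.env.EXPO_PUBLIC_SUPABASE_URL!;
const supabaseAnonKey = process.env.EXPO_PUBLIC_SUPABASE_ANON_KEY!;

export const supabase = createClient(supabaseUrl, supabaseAnonKey, {
  auth: {
    storage: AsyncStorage,     // persist the session across app restarts
    autoRefreshToken: true,
    persistSession: true,
    detectSessionInUrl: false, // no URL-based auth redirects in a native app
  },
});
```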

What didn’t work: design-system setup. The designers had a Figma file with custom typography, colors, and component variants. Cascade’s first attempt at translating this to a Tamagui theme was generic — not bad, but not matching our spec. I redid it manually. About 4 hours, all me.

Weeks 2-3: feature implementation

This was the most productive period. The designers handed me Figma screens for each feature (habit creation, habit tracking, daily summary, profile). My workflow:

  1. Open the Figma screen alongside Windsurf
  2. Describe the screen to Cascade in detail (or upload a screenshot)
  3. Cascade generates the React Native components
  4. I review and refine, particularly the spacing and typography
  5. I add the animations (more on this below)
  6. I wire up the data layer

Cascade was good at:

  • Component structure (which components, how nested)
  • Layout (Flex, padding, alignment)
  • TanStack Query integration (query keys, mutation logic; see the sketch after this list)
  • Form handling with react-hook-form
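
To make the query-integration point concrete, this is the shape of hook Cascade reliably generated (a representative sketch; table and column names are illustrative):

```ts
import { useQuery } from '@tanstack/react-query';
import { supabase } from '../lib/supabase';

// One hook per resource; the query key encodes the resource and its
// parameters, so invalidation after a mutation is a one-liner.
export function useHabits(userId: string) {
  return useQuery({
    queryKey: ['habits', userId],
    queryFn: async () => {
      const { data, error } = await supabase
        .from('habits')
        .select('*')
        .eq('user_id', userId)
        .order('created_at', { ascending: true });
      if (error) throw error;
      return data;
    },
    enabled: !!userId, // don't fire until auth has resolved
  });
}
```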

Cascade was less good at:

  • Pixel-perfect spacing (consistently 4-8 pixels off in various places)
  • Typography hierarchy (not always picking the right variant from the design system)
  • Color usage (sometimes used hex codes instead of the theme tokens)

For pixel-perfect work, I’d iterate with Cmd+K to fix specific issues. For typography and color usage, my .windsurfrules file got tighter over time, and Cascade improved.
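
For reference, the kind of rules that made the difference (an illustrative excerpt; the variant names and scale are hypothetical):

```
# .windsurfrules (excerpt)
- Never hard-code colors. Use theme tokens from the Tamagui config, not hex values.
- Typography: use only the design-system Text variants (display, heading, body, caption).
  Never set fontSize or fontWeight directly on a component.
- Spacing: use the theme's 4pt spacing scale. No arbitrary pixel values.
```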

Average time per feature screen: 4-6 hours including animations. Pre-AI estimate: 8-12 hours. Significant speedup.

The animation problem

React Native Reanimated 3’s API is powerful but verbose. Animations require:

  • Worklets that run on the UI thread
  • Shared values for animated properties
  • Specific operators for derived values
  • Careful coordination with React’s render cycle
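
For anyone who hasn't used it, the moving parts look like this (a minimal sketch of the API, not code from the project):

```tsx
import type { ReactNode } from 'react';
import { Pressable } from 'react-native';
import Animated, {
  useSharedValue,
  useDerivedValue,
  useAnimatedStyle,
  withSpring,
} from 'react-native-reanimated';

// A pressed-scale effect showing each piece from the list above.
export function ScaleOnPress({ children }: { children: ReactNode }) {
  // shared value: mutable state readable from the UI thread
  const pressed = useSharedValue(0);

  // derived value: recomputed on the UI thread whenever `pressed` changes
  const scale = useDerivedValue(() => withSpring(pressed.value ? 0.96 : 1));

  // the style callback is a worklet, run on the UI thread each frame
  const style = useAnimatedStyle(() => ({ transform: [{ scale: scale.value }] }));

  return (
    <Pressable onPressIn={() => (pressed.value = 1)} onPressOut={() => (pressed.value = 0)}>
      <Animated.View style={style}>{children}</Animated.View>
    </Pressable>
  );
}
```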

Cascade’s training data clearly has plenty of Reanimated 2 examples and fewer for Reanimated 3. Many of its first attempts used the older API. I’d correct it, Cascade would adapt, and subsequent animations were better.

A specific case: a habit completion animation (a checkmark draws in, then a confetti burst). My first attempt with Cascade produced something that worked but felt clunky. After three rounds of “make it feel snappier” and Cmd+K refinements, the animation was good — but I’d spent about 90 minutes on a single animation.

For a designer-led product, this matters. The animations are part of what makes the app feel polished; I couldn’t afford to leave them rough.

What ended up working: I built a small library of animation primitives (entry, exit, success, error, loading) myself in week 2. Cascade reused them in subsequent features. The library was about 200 lines; once it existed, Cascade’s animation work became much faster because it had patterns to compose.
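
A sketch of one primitive from that library (simplified; the real versions parameterized duration and easing):

```tsx
import { type ReactNode, useEffect } from 'react';
import Animated, {
  useSharedValue,
  useAnimatedStyle,
  withDelay,
  withTiming,
  Easing,
} from 'react-native-reanimated';

// The "entry" primitive: fade in while sliding up. Exit, success, error, and
// loading followed the same shape: one shared value, one style, one trigger.
export function EntryTransition({ children, delay = 0 }: { children: ReactNode; delay?: number }) {
  const progress = useSharedValue(0);

  useEffect(() => {
    progress.value = withDelay(delay, withTiming(1, { duration: 250, easing: Easing.out(Easing.cubic) }));
  }, [delay]);

  const style = useAnimatedStyle(() => ({
    opacity: progress.value,
    transform: [{ translateY: (1 - progress.value) * 12 }],
  }));

  return <Animated.View style={style}>{children}</Animated.View>;
}
```

With primitives like this in the codebase, Cascade had concrete patterns to compose instead of reinventing worklets for every screen.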

Week 4: backend integration

The Supabase backend was mostly straightforward. Tables, RLS policies, edge functions for the few things that couldn’t be client-side queries.

Cascade did well with:

  • Generating Supabase queries (the client API is well-trained)
  • Writing SQL migrations
  • Generating TypeScript types from the database schema

Cascade did poorly with:

  • RLS policies (these have specific patterns; Cascade’s defaults didn’t match Supabase best practices)
  • Realtime subscriptions (the API surface is small but Cascade got the cleanup wrong consistently)
  • Edge functions (Deno-specific patterns confused it)

For RLS, I wrote the policies myself after a couple of bad attempts from Cascade. The policies are security-critical; I didn’t want to ship something I hadn’t reasoned through.

For realtime, I refactored Cascade’s first version to fix the subscription cleanup issues (memory leaks otherwise). The pattern is now standard in our codebase; future realtime additions go faster.
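
The fixed pattern, roughly (a sketch; table and channel names are illustrative):

```ts
import { useEffect } from 'react';
import { supabase } from '../lib/supabase';

// Subscribe to changes for one habit; the effect cleanup removes the channel,
// which is the step Cascade consistently omitted (and what leaked memory).
export function useHabitRealtime(habitId: string, onChange: () => void) {
  useEffect(() => {
    const channel = supabase
      .channel(`habit-${habitId}`)
      .on(
        'postgres_changes',
        { event: '*', schema: 'public', table: 'habit_entries', filter: `habit_id=eq.${habitId}` },
        onChange
      )
      .subscribe();

    return () => {
      supabase.removeChannel(channel);
    };
  }, [habitId, onChange]);
}
```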

Week 5: polish and edge cases

Week 5 was spent on the kind of work that makes the difference between a tutorial app and a real app:

  • Loading states (skeleton screens, not just spinners)
  • Error states (with retry, with analytics tracking)
  • Empty states (with illustrations, not just text)
  • Accessibility (VoiceOver labels, dynamic type)
  • Performance (FlashList for long lists, image caching)
  • Offline handling (TanStack Query cache, optimistic updates)

Cascade was useful here but with diminishing returns. Each of these is small but matters. Cascade’s defaults are bare-bones; I’d ask for the polished version, Cascade would produce something passable, and I’d refine.
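
One example from the offline-handling bucket, since it was the pattern I reused most: an optimistic mutation with rollback (a sketch; types and table names are illustrative):

```ts
import { useMutation, useQueryClient } from '@tanstack/react-query';
import { supabase } from '../lib/supabase';

type Habit = { id: string; completedToday: boolean };

// Mark a habit complete: update the cache immediately so the UI responds
// even offline, roll back if the server write fails, then resync.
export function useCompleteHabit() {
  const queryClient = useQueryClient();

  return useMutation({
    mutationFn: async (habitId: string) => {
      const { error } = await supabase
        .from('habit_entries')
        .insert({ habit_id: habitId, completed_at: new Date().toISOString() });
      if (error) throw error;
    },
    onMutate: async (habitId) => {
      await queryClient.cancelQueries({ queryKey: ['habits'] });
      const previous = queryClient.getQueryData<Habit[]>(['habits']);
      queryClient.setQueryData<Habit[]>(['habits'], (old) =>
        old?.map((h) => (h.id === habitId ? { ...h, completedToday: true } : h))
      );
      return { previous };
    },
    onError: (_err, _habitId, context) => {
      queryClient.setQueryData(['habits'], context?.previous); // roll back
    },
    onSettled: () => {
      queryClient.invalidateQueries({ queryKey: ['habits'] }); // resync with server
    },
  });
}
```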

Net result: polish took about 50% of my week-5 time; without Cascade, the same ground might have taken 80%. I covered more, but the speedup was smaller than in the productive middle weeks.

Week 6: TestFlight prep

Build configuration, app store assets, privacy declarations, testing on real devices. This was almost entirely my work; Cascade contributed minimally.

The exception: TestFlight rejection on first attempt due to a privacy declaration mismatch. Cascade helped me debug what I’d missed. Saved maybe an hour of investigation.

What worked

Cascade for screen scaffolding. The bulk of an app’s code is screen components with similar shape. Cascade produced these efficiently.

Cascade for backend integration. The query layer (TanStack Query + Supabase) was a sweet spot. Lots of repetitive code, well-defined patterns.

Strict design system in code. Spending the first week building a robust design system paid off across the rest of the project. Cascade’s defaults were never as good as my system’s, so the system carried the polish.

Designer-engineer collaboration through Windsurf. The designers couldn’t read the code, but they could ask Cascade questions like “why doesn’t this animation match the spec?” and get explanations. Cascade explaining code in plain language was a useful interface for the designers.

What didn’t work

Cascade for animations. The Reanimated 3 API is simply not one of the model’s strengths. I spent more time on animations than I should have, including iteration with Cascade that didn’t help.

Cascade for RLS/security. Anything where the wrong code is dangerous, I didn’t trust Cascade’s defaults. Manual implementation was the right call.

Cascade for cross-cutting refactors. When I needed to change a pattern across many files (e.g., updating the loading state pattern after I built the skeleton component), Cascade missed files. The refactor was easier to do with sed-like operations than with Cascade.

The honest productivity assessment

Total project time: 6 weeks.

Estimate without Cascade: 8 weeks.

Cascade saved about 2 weeks. Most of the savings was in weeks 2-3 (the productive middle). Diminishing returns in the polish and integration phases.

Cost: Windsurf Pro subscription ($15/month × 2 months) plus about $40 in extra Flow credits = $70 total.

For a client project, this is a great ratio. The 2 weeks of saved time is worth far more than the tool cost.

Would I use Windsurf for the next mobile project?

Yes. Cascade fit the work well, especially the screen-by-screen implementation pattern. Flows wasn’t a great fit (most tasks were too small for Flows to be worth it), but Cascade carried the load.

The caveats: I’d budget more time for the parts Cascade does poorly — animations, security-sensitive code, cross-cutting refactors. These need engineering attention regardless of AI tooling, and I underbudgeted them in this project.

For pure greenfield mobile work with a clear design system, Windsurf is competitive with anything else. For more complex apps (heavy animation, heavy native modules, heavy state management), the gap closes — Cascade’s defaults match the simple cases, and the complex cases are still mostly engineering work.

What the designers thought

The two designers were skeptical about AI tools at the start. By the end:

  • The product designer’s view: “It made the implementation faster, which meant we could iterate more on the design. Worth it.”
  • The visual designer’s view: “It’s not as good at visual polish as I’d like. I had to push back on details a lot. But the basic implementation was fast enough that we had budget for the polish iterations.”

These are roughly my views from the engineering side, in different language. The cross-disciplinary takeaway: AI tools shift where the bottleneck is. It used to be implementation speed; now it’s design quality. Teams with strong design quality are well-positioned to take advantage; teams without it may produce more, but more of what they produce will be mediocre.