Weaviate Go Client V6: Designing For Idiomatic Go
Hey there, fellow Go developers! We're diving deep into the exciting world of the Weaviate Go Client v6. This isn't just another update; it's a fundamental redesign focused on making your development experience smoother, more intuitive, and, dare I say, more Go-like. We've been pondering some key design decisions, and I'm eager to share our thoughts and get your feedback. Our primary goal? To embrace a more idiomatic Go approach, favoring simple structs for passing options and reserving variadic options primarily for client instantiation. Let's unpack what this means and why it’s a big deal for how you interact with Weaviate.
Single Insert: Embracing Simplicity Over Complexity
When it comes to single inserts in Weaviate, we've noticed a potential pitfall in the proposed variadic parameter approach. Imagine inserting a single data object through a method whose signature accepts any number of property and vector options. This quickly becomes unwieldy and, more importantly, is less auto-discoverable than we'd like. For instance, a function like data.WithProperties would need to accept its argument as a generic any type to accommodate ORM-like structs, such as a Song struct. While flexible, this introduces an internal implementation hurdle: to turn such a struct into the map[string]any format Weaviate expects, our code would likely need a round trip, marshaling the struct to JSON and then unmarshaling it back into a map. This adds overhead and complexity that we believe can be avoided.
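To make that conversion cost concrete, here is a minimal sketch of what such a round trip could look like using only the standard library. The Song struct, its JSON tags, and the toProperties helper are our own illustrative assumptions, not part of the client.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Song is a hypothetical ORM-style struct used only for illustration.
type Song struct {
	Title  string `json:"title"`
	Artist string `json:"artist"`
	Year   int    `json:"year"`
}

// toProperties (hypothetical helper) converts a JSON-serializable value into
// map[string]any by marshaling it to JSON and unmarshaling the bytes again.
func toProperties(v any) (map[string]any, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return nil, fmt.Errorf("marshal properties: %w", err)
	}
	var props map[string]any
	if err := json.Unmarshal(raw, &props); err != nil {
		return nil, fmt.Errorf("unmarshal properties: %w", err)
	}
	return props, nil
}

func main() {
	props, err := toProperties(Song{Title: "Bohemian Rhapsody", Artist: "Queen", Year: 1975})
	if err != nil {
		panic(err)
	}
	// After the round trip, JSON numbers are float64, not int.
	fmt.Printf("%T\n", props["year"])
}
```

Besides the extra allocation and encoding work, note the side effect: the integer Year comes back as a float64, and anything that is not a JSON object (for example a bare number) only fails at run time.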
Let's look at what could go wrong with the variadic approach:
songs := client.Collections.Use("Songs")
songs.Data.Insert(ctx,
    data.WithProperties(Song{Title: "Bohemian Rhapsody", Artist: "Queen", Year: 1975}),
    data.WithProperties(Song2{Artist: "Blur"}),
    data.WithProperties(Sing{artists: "many"}),
    data.WithVector(types.Vector{Single: []float32{0.1, 0.2, 0.3}}),
    data.WithVector(types.Vector{Single: []float32{0.11, 0.22, 0.33}}),
)
This example highlights how the WithProperties and WithVector calls could become repetitive and potentially ambiguous. What if Song, Song2, and Sing represent different schemas or have overlapping fields? The flexibility here comes at the cost of clarity and potential runtime issues if the internal marshaling isn't handled perfectly. It places a significant burden on both the developer to understand the internal workings and on the client library to gracefully manage diverse inputs.
Our proposed solution steers towards a more idiomatic Go pattern. Instead of variadic options for each property or vector, we envision a single, clear structure representing the object to be inserted. This approach makes the intent immediately obvious and simplifies the data structure:
songs := client.Collections.Use("Songs")
songs.Data.Insert(ctx,
    &types.Object{
        Properties: types.Properties{
            "title":  "Bohemian Rhapsody",
            "artist": "Queen",
            "year":   1975,
        },
        Vectors: []types.Vector{
            {Single: []float32{0.1, 0.2, 0.3}},
        },
    },
)
In this idiomatic proposal, we pass a single types.Object struct. This struct clearly delineates the properties and vectors associated with the object. The Properties field is a map[string]any, which is a standard Go way to handle flexible key-value data, and Vectors is a slice of types.Vector. This design is more explicit, easier to read, and removes the ambiguity associated with multiple WithProperties or WithVector calls. It aligns with Go's philosophy of explicitness and reduces the chances of unexpected behavior during marshaling. This shift promises a more robust and developer-friendly experience for handling data insertions.
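As a rough sketch of why this is simpler internally, the stand-in types below can be encoded in a single marshal step. Everything here is hypothetical: the type definitions mirror the names used above, while the JSON field names and the encodeObject helper are assumptions made purely for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the proposed types package.
type Properties map[string]any

type Vector struct {
	Single []float32 `json:"single"` // field name in JSON is an assumption
}

type Object struct {
	Properties Properties `json:"properties"`
	Vectors    []Vector   `json:"vectors,omitempty"`
}

// encodeObject (hypothetical) serializes the object directly; no
// intermediate struct-to-map conversion is needed.
func encodeObject(obj *Object) ([]byte, error) {
	if obj == nil {
		return nil, errors.New("object must not be nil")
	}
	return json.Marshal(obj)
}

func main() {
	raw, err := encodeObject(&Object{
		Properties: Properties{"title": "Bohemian Rhapsody", "artist": "Queen", "year": 1975},
		Vectors:    []Vector{{Single: []float32{0.1, 0.2, 0.3}}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
}
```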
NearVector Query: Navigating Ambiguity with Structure
Another area where we're rethinking the approach is the NearVector query. The current proposal for using variadic parameters here also risks introducing ambiguity. Consider this example:
queryVector := types.Vector{Single: []float32{0.1, 0.2, 0.3}}
result, err := songs.Query.NearVector(ctx, queryVector,
    query.WithLimit(10), query.WithLimit(100), query.WithLimit(1000),
)
What happens when query.WithLimit is called multiple times? Which limit takes precedence? The last one? The first one? It's unclear and prone to errors. This pattern, while seemingly flexible, can lead developers down a rabbit hole of trying to decipher the exact behavior or, worse, writing code that behaves unexpectedly. The lack of a clear, single source of truth for query parameters makes debugging and reasoning about the query difficult.
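To see why the answer is not obvious from the call site, consider a minimal sketch of how functional options are commonly implemented. The queryConfig type, the Option type, and buildConfig are hypothetical and only mimic the shape of the proposal; the real client could resolve duplicates differently.

```go
package main

import "fmt"

// queryConfig and Option are hypothetical stand-ins for the proposed query package.
type queryConfig struct {
	limit int
}

type Option func(*queryConfig)

// WithLimit returns an option that overwrites the limit.
func WithLimit(n int) Option {
	return func(c *queryConfig) { c.limit = n }
}

// buildConfig applies options in order, so later options silently
// overwrite earlier ones.
func buildConfig(opts ...Option) queryConfig {
	var cfg queryConfig
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	cfg := buildConfig(WithLimit(10), WithLimit(100), WithLimit(1000))
	fmt.Println(cfg.limit) // prints 1000
}
```

With this common implementation the last option wins, but nothing in the signature tells the caller so, and the compiler happily accepts the duplicates.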
We believe a more structured, idiomatic Go approach would significantly improve this. Our draft proposal introduces a dedicated struct for NearVector query parameters. This encapsulates all the query options in a single, well-defined entity, eliminating ambiguity and making the query's intent crystal clear:
queryVector := types.Vector{Single: []float32{0.1, 0.2, 0.3}}
result, err := songs.Query.NearVector(ctx,
    &query.NearVector{
        Vector: queryVector, // Renamed for clarity
        Limit:  query.Limit(10),
    },
)
In this revised approach, we pass a single query.NearVector struct. This struct clearly defines the Vector to search with and the Limit. If we needed to add more options, like Distance or Certainty, they would become fields within this struct. This makes the query definition explicit and easier to read. For example, adding a Distance parameter would look like this:
result, err := songs.Query.NearVector(ctx,
    &query.NearVector{
        Vector:   queryVector,
        Limit:    query.Limit(10),
        Distance: 0.5, // Example distance value
    },
)
This pattern is highly extensible. As Weaviate adds more query capabilities, we can simply add more fields to the NearVector struct. It’s a clean, maintainable, and inherently Go-like way to manage complex query parameters. This design decision focuses on predictability and developer experience, ensuring that your queries are exactly what you intend them to be, without hidden complexities or ambiguities. We are committed to making interactions with the Weaviate API as transparent and straightforward as possible.
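One detail worth sketching is the query.Limit(10) helper. A plausible reason for it, and this is our assumption rather than a settled design, is that Limit is a pointer field, so the client can tell "not set" apart from an explicit zero. The types, the default value, and effectiveLimit below are all hypothetical.

```go
package main

import "fmt"

// Hypothetical stand-ins for types.Vector and query.NearVector.
type Vector struct {
	Single []float32
}

type NearVector struct {
	Vector Vector
	Limit  *int // nil means "not set"; assumed design
}

// Limit wraps an int in a pointer so it can be used inline in a struct literal.
func Limit(n int) *int { return &n }

// defaultLimit is an illustrative value, not a real server default.
const defaultLimit = 25

// effectiveLimit resolves the limit the query would use.
func effectiveLimit(q *NearVector) int {
	if q.Limit == nil {
		return defaultLimit
	}
	return *q.Limit
}

func main() {
	unset := &NearVector{Vector: Vector{Single: []float32{0.1, 0.2, 0.3}}}
	explicit := &NearVector{Vector: unset.Vector, Limit: Limit(10)}
	fmt.Println(effectiveLimit(unset), effectiveLimit(explicit)) // prints 25 10
}
```

Because each field can appear only once in a struct literal, the compiler rejects a duplicated Limit outright, which is exactly the ambiguity the variadic version could not rule out.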
GroupBy Queries: Refining Syntax for Clarity
When it comes to GroupBy queries, we've encountered some syntax proposals that don't quite feel right for Go. The idea of extending query methods directly, like songs.Query.NearVector.GroupBy(...), presents a syntactic challenge. It implies a method chaining that, in Go, often becomes verbose and less readable, especially when dealing with multiple query types that might support grouping.
We've seen a potential implementation where a GroupBy method is attached to NearVector itself, which in Go means NearVector has to be a value of a named function type that carries its own methods:
// Standard vs grouped NearVector queries - function-as-receiver pattern
single, err := songs.Query.NearVector(ctx, vector, query.WithLimit(10))
groups, err := songs.Query.NearVector.GroupBy(ctx, vector, "category", query.WithLimit(10))
This syntax, while functional, raises questions about how discoverable and maintainable it would be across different query types (like Get or Generate). Would each query type need its own GroupBy extension? This could lead to a proliferation of methods and a less unified API.
To illustrate how this might work with Get and Generate queries, an example has been shared via Gist, showing a potential path forward. However, we're exploring an alternative that aligns better with our struct-based approach for query parameters. Instead of attaching GroupBy as a method, we propose incorporating grouping parameters directly into the query struct itself, much like we did with NearVector.
Consider this potential idiomatic approach for a GroupBy query:
// Using a dedicated GroupBy configuration within the query struct
groups, err := songs.Query.Get(ctx, &query.GetConfig{
    ClassName: "Songs",
    GroupBy: &query.GroupByParams{
        Property: "category",
        Limit:    10,
    },
    // Other Get parameters...
})
In this proposal, the GroupBy logic is encapsulated within the query configuration struct (query.GetConfig in this example). This makes the intent explicit: we are configuring a Get query, and one of the configurations is to group the results by a specific property ("category") with a given Limit. This approach offers several advantages:
- Unification: All query parameters, including grouping, are managed within a single configuration struct for each query type. This leads to a more consistent API across different query operations.
- Discoverability: Developers can easily see all available options for a given query type by inspecting the configuration struct.
- Extensibility: Adding new grouping features or parameters is as simple as adding new fields to the GroupByParams struct or the main query config struct.
- Readability: The structure clearly separates different aspects of the query, making it easier to understand at a glance.
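The struct-based grouping idea can be sketched with a toy, in-memory example. The types mirror the names used above but are hypothetical, as is the groupRecords helper; we also assume here that Limit caps the number of objects per group, which the proposal does not yet pin down.

```go
package main

import "fmt"

// Hypothetical stand-ins for query.GroupByParams and query.GetConfig.
type GroupByParams struct {
	Property string
	Limit    int // assumed meaning: max objects per group; 0 means no cap
}

type GetConfig struct {
	GroupBy *GroupByParams // nil means "do not group"
}

type record map[string]any

// groupRecords buckets records by the configured property. Without a
// GroupBy config, everything lands in a single unnamed group.
func groupRecords(records []record, cfg *GetConfig) map[string][]record {
	if cfg == nil || cfg.GroupBy == nil {
		return map[string][]record{"": records}
	}
	groups := make(map[string][]record)
	for _, r := range records {
		key := fmt.Sprint(r[cfg.GroupBy.Property])
		if cfg.GroupBy.Limit > 0 && len(groups[key]) >= cfg.GroupBy.Limit {
			continue // group is full
		}
		groups[key] = append(groups[key], r)
	}
	return groups
}

func main() {
	songs := []record{{"category": "rock"}, {"category": "rock"}, {"category": "pop"}}
	grouped := groupRecords(songs, &GetConfig{
		GroupBy: &GroupByParams{Property: "category", Limit: 10},
	})
	fmt.Println(len(grouped["rock"]), len(grouped["pop"])) // prints 2 1
}
```

Grouping is switched on or off by a single optional field, so the same Get call covers both cases without a separate method.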
This aligns with our overall v6 design philosophy: use clear, well-defined structs to represent complex configurations and data structures. It moves away from potentially ambiguous method chaining or function-as-receiver patterns towards a more explicit and maintainable API. We believe this refined approach to GroupBy queries will make your interaction with Weaviate's powerful aggregation features more intuitive and less error-prone. Your feedback on this direction is invaluable as we shape the future of the Weaviate Go Client.