Ep # 116: Balancing Innovation and Scalability in B2B Data Integration: Insights and Industry Trends ft. Peter Fishman (CEO & Co-founder, Mozart Data)

Episode Summary

In this comprehensive interview with Adil Saleh and Saad Ali from the Hyperengage podcast, Peter Fishman, CEO and co-founder of Mozart Data, shares insights into the company's origins, its mission, and its approach to simplifying data centralization for B2B companies. Fishman highlights his journey from academia to technology, emphasizing the pivot from complex data analysis to making data accessible and actionable for companies without extensive engineering efforts. He discusses Mozart Data's role in enabling businesses to leverage the modern data stack, the impact of AI on democratizing data, and the importance of building a culture that values the Pareto principle and continuous innovation through hack days. The conversation also touches on industry trends, the challenges of scaling data integration platforms, and the balancing act between developing specialized solutions and offering a comprehensive, integrated data stack. Fishman underscores the company's focus on customer success, driven by making data actionable and optimizing costs, and shares Mozart Data's plans for embracing AI enhancements and expanding its engineering team to continue evolving its product offering.

Key Takeaways	Time
Mozart's focus on making data activation easy for customers	00:01
How their most successful customers invest in proper data setup	07:30
How costs can ramp up with tools like Snowflake and how Mozart helps with cost management	11:00
Debate on tools for startups like Snowflake vs more cost-effective options	15:30
Mozart's focus on their core capabilities vs trying to innovate broadly	25:00
Mozart's culture of embracing the 80/20 rule and regular hack days	35:00
Their primary hiring needs are for engineering roles	38:30

Transcript

Adil Saleh (0:00:03) - Hey, greetings, everybody. This is Adil from the Hyperengage podcast, a very special episode. It was pretty much long awaited. I have my cohost, Saad, with me. He's also kind of the technical co-founder of a B2B SaaS company. I have Peter Fishman with us, who's the CEO and co-founder at Mozart Data. They are helping B2B companies build teams with data centralization, which is something that we speak about a lot, like how you can unify data points, how you can have centralized data points without any engineering efforts, and how you can streamline your infrastructure from the back end for scale. So I really appreciate you, Peter, for taking the time, and thank you, Saad, for joining today.

Saad Ali (0:00:42) - Thank you.

Peter Fishman (0:00:43) - Thanks for having me.

Adil Saleh (0:00:45) - Perfect. So, Peter, I was kind of intrigued. You've been walking through, like, of course, with an engineering background, technical background, you've been a part of different analytics roles. And I was just kind of curious, how did just that, like, connect the dots with what you're doing at Mozart Data? Walk us through your journey. It seems pretty inspiring.

Peter Fishman (0:01:07) - Sure. You know, to steal a line from one of my idols, Steve Jobs, it's sort of easier to connect the dots backwards. So when I started my career, I was, you know, like many people in data, something of like a failed academic. So I had done some sort of graduate work in basically economics and statistics. And I really, you know, always loved data, drawing conclusions and insights from data, and ended up sort of finding my way into the tech world, trying to take what was then big datasets and find magic and insights in them. And very quickly, I realized that for most companies, it's less about an advanced technique. I spent years trying to be really precise about certain estimates in my graduate school work.
But that ability to spin up inference really quickly on sort of imperfect data sets that have a lot of messy data became sort of a real sort of centerpiece of the work that I was doing at the companies I was at. So sometimes the challenge was that the data was big but messy, or sometimes the data was very sparse, or sometimes there were just a lot of errors that came up. And this became a theme at a lot of the different places that I worked at. And often we faced a lot of challenges spinning up infrastructure, or the right infrastructure at these places. It involved a lot of resources, people, and then expensive products. And I was building the same thing over and over again. Now the underlying technologies were changing and getting better and better at each new job stop. But ultimately, in 2020, at the start of the pandemic, I decided to work with my good friend Dan Silberman on building the type of stack that we would implement at whatever the next company we would go to. So effectively building ourselves as a service. And Dan and I sort of created Mozart Data with the goal of getting companies set up on the modern data stack, allowing them to sort of harness the power of the modern data stack that used to be sort of reserved for the folks that really could afford someone like Dan. And that was kind of our vision for the company. Got going April 2020. So a true pandemic company and been at it for almost four years now.

Adil Saleh (0:03:51) - Amazing, amazing. You know, I was also curious about the fact that a lot of these companies, post, you know, AI evolution after this ChatGPT-4, they're trying to make data more actionable for CS, for sales, for all these GTM functions and all, and now they're evolving in the product teams and the developer teams as well. How do you see this industry from this point on? Like, when you talk about data being, of course, accessible to non-technical teams as well as more actionable?

Peter Fishman (0:04:25) - Sure.
So this wasn't an AI trend. It's certainly been the trend of the past two decades to get more data, to democratize data, to make data accessible to sort of all the teams that you mentioned, not just the product team, but like the CS team and sales teams, and just making better decisions across the board at companies with data. And that starts with good central data and then that ability to consume those sort of centrally defined data or sort of well understood data or clean data or good tables, and having that ability to consume that in sort of the far reaching parts of the organization. A good example is maybe a customer success team, but really it's sort of almost everybody at the organization. It maybe was historically sort of housed within a BI or a technical team, but really the power is when the operators and decision makers can use data at their fingertips. So a big part of the last, you know, this was again, well before the AI movement, though I do expect AI to very much accelerate it. A big part of the last, you know, 20 years has been, you know, in democratizing sort of big data and sort of moving that capability. And the most notable step has been in the BI tool. So the BI tool was all about having sometimes drag and drop ways. But really, given a well defined data model, how can either technical people set up reports and then those reports be consumed by the operators, or I think the much better model is when you have a partnership of someone really savvy and capable with not just sort of reading the data, but also cutting the data and reshaping the data and cleaning the data and understanding and knowing the data, partnering with somebody that understands the business, the business context, and is going to operate against that data. So there's sort of the self serve model where you've seen a lot of sort of progress. Now that's sort of the progress on the BI side. I'm very passionate about all the stuff that goes on sort of under the hood of the car.
So I don't want to neglect and I sort of live in the world of all of the progress that has been made, obviously on the data warehousing side, but not just that, what I would think of as the ETL T side of the house as well. So all this stuff under the hood has also made incredible steps in the last five to ten years.

Saad Ali (0:07:02) - Very interesting. Very interesting, Peter, just like trying to understand. And this would be good for our audience also. Like what would be the ideal customer of Mozart Data? Is it a company that already has a homegrown data pipeline where they have some sort of ETL framework going in, they have their own warehouse, they have the team, or like, is it a company that is just starting off at this point where they are at a position where they have a lot of data and now they need to start making more sense of it? So where does like Mozart Data stand in terms of their ideal customer profile?

Peter Fishman (0:07:36) - Sure. Probably the worst answer is a little bit of both. So at the highest level, we create the most value for companies in that latter group. So we help companies get started on their data journey where they typically don't have a powerful columnar warehouse. So they're trying to get serious about the data. They might be querying their production system, they might basically have data in the form of CSV files. They're using SaaS tools, they're creating data and they need to figure out a way to centralize that. We do that, I think, incredibly well. So we basically help them ETL their data and set up a warehouse. And basically in under an hour, they've got the power of the modern data stack. So our ICP can be that group that you just described, sort of. They're at a very sort of starting point in their data journey. But there is a really key ingredient that is maybe a little bit hard to identify on the surface.
And that is that sort of commitment to data, that sort of passion for data, that wanting to consume the data, for that to be sort of. I think there's a lot of companies that generate a lot of data, that have a lot of data and that can actually build really incredible and amazing pipelines. But those might be just for show and not for action. So, you know, a Mozart ICP is one that basically really wants to use that data. What we find is that very quickly companies scale their consumption. So maybe they're, you know, doing ETL from a single source or multiple sources or a database, but actually they're going to really want to enrich their dimension tables, they're going to add columns and they're going to pull in more sources and their data consumption is going to really rise. And that's our true ICP. And that falls more into your first category. So that falls into the category of. So, you know, our business is to grow with the customer. It's to start at the starting line alongside them and help them get going. But really there needs to be something sort of intrinsic to the company that is all about, okay, we want to start really leveraging the advantages that are sort of hidden within this data. And they typically have somebody, and this doesn't have to be a data scientist, but it's someone that really sort of maybe has seen good data practices at other companies, or maybe they're just passionate about data and those folks tend to be really our ICP and we'll sort of go along the journey and scale with them, but be a great partner to them even as they sort of reach bigger and bigger unicorn, decacorn status.

Saad Ali (0:10:32) - Interesting, interesting. Now, just like understanding as we're jumping into this more at a high level, how would Mozart compare with a solution if a company goes towards, let's say, using Airbyte and their own warehouse, where they just pick up all the data sources, throw them in there, use dbt for the data transformation layer.
How does Mozart fit into that, where I think you guys leverage Fivetran for the connector bit? And how's that partnership also working? And then we can move on to much more detail on how all this glues together.

Peter Fishman (0:11:05) - Sure. So there are many ways to sort of create the modern data stack. There are many blog posts on effectively the open source modern data stack. And putting together the modern data stack I think is really important for companies at the start. What's really great about the modern data stack is that almost all the vendors have largely aligned on usage based pricing. So companies that are not consuming that much data have small bills. Companies that are consuming lots and lots of data have larger bills, sometimes very large bills. So you're right, as a tool under the hood, we use a variety of EL vendors, including white-label Fivetran. So we sort of want to offer a best in class experience for a number of our connectors. We also have sort of services for long tail connectors. We also write our own connectors. We also use open source connectors. So our philosophy tends to be, you know, if there's an API and we can move it to a data warehouse, we're going to figure out a way to do that. So I think basically, you know, I think for most of the standard SaaS tools, I think there's a lot of, you know, great connector options. You know, you mentioned Airbyte. We're a Fivetran shop. But you know, I think, you know, Fivetran is probably, you know, a best in class for a number of SaaS tools that you might be wanting to get data from and move to your warehouse. But we try to offer a best in class experience. Then under the hood on the warehouse side, we offer managed Snowflake or bring your own Snowflake.
And then we have a lightweight transformation layer that has a lot of the components of the T space, including some run histories and versioning and a lot of things that sort of enable lineage graphs that basically help enable you to debug issues within your pipeline. But we also offer hosted dbt for folks that are very familiar with that tool and able to use that tool very effectively. But our main goal is to get a company going on their data journey. And again, given the billing model that I just mentioned, we hope that that starts at a pretty early stage because companies will very quickly get value out of their data. And if they have someone that really wants to look into it, they will find real opportunities within it.

Saad Ali (0:13:58) - Okay, got it. Just like as you mentioned, the billing model, how does the pricing structure work for different kinds of companies with Mozart? Like a lot of these companies, like, you know, they charge for active rows synced in. That's a model I think Airbyte follows, and then, you know, there are multiple other solutions. I think Fivetran also has a similar kind of model. So as you're building on top of it, how do you ensure that you stay on a flexible model where you can like scale up with the customers that you're onboarding?

Peter Fishman (0:14:26) - Sure. So we don't really want to reinvent the wheel on pricing. We want to basically reinvent the wheel on the way these tools like glue together. And you know, we charge very similar to the rest of sort of the modern data stack. So we charge on kind of like, you know, active rows and we charge on compute. And so in that sense, we're very much not reinventing the wheel. That said, we do have some bundles. You know, part of our core philosophy is basically, you know, to bundle these tools together. And similarly we have like bundled and easy pricing.
Generally, small startups like to have a little bit more sort of like certainty around their bill, so we'll sort of put them on a bundled package, whereas sort of most of our larger customers are on sort of a usage based plan that we sort of either tailor to their consumption based on their consumption history, or we sort of go with the traditional model on the usage metrics that you mentioned.

Adil Saleh (0:15:38) - Amazing, amazing. So now slightly taking it to a different curve, now thinking of a company started back in 2020 post COVID, almost in the COVID when it was on peak, reaching their first 1 million in annual recurring revenue. What was the key? Like, the biggest challenge that we face when we talk to the companies is the onboarding challenge. When these data platforms come in, the biggest challenge is the time to value during the onboarding. So that is sometimes like one month, two months, three months. We cannot, like they tell us one month. It's mostly three months, two to three months. So how did you crack this onboarding code in the beginning and when, you know, you had like, you wanted to jump into smaller companies and then you went up market, slightly up market up to now, sitting at three day, what was, what did the onboarding look like at that time? Any lessons that you've learned in the beginning for all these startups that are thinking of going in hyper growth mode, but they're still struggling inside, in house?

Peter Fishman (0:16:41) - Sure. So I think I'd be lying to say that it just sort of instantly snapped and clicked for us. And then a lot of what we did in 2020 were hacks. So Dan and I, shortly after starting the company, participated in Y Combinator. So at Y Combinator, if you release something you're really proud of, you've done something wrong. That's a little bit facetious. But the thought there is you should feel anxious about the product you're putting out so that you can get that sort of rapid feedback iteration.
And we sort of had a bunch of sort of guinea pigs within our YC batch and sometimes it went horribly wrong, as in the product didn't work, the syncs didn't happen. We expected them to be by the end of the call counting every metric that they cared about. And of course that didn't happen. And then also, you know, our product was a lot of Dan and I doing things in the background. So it wasn't vaporware, but it was a lot of, you know, the way that you set up this complex infrastructure is you click this button and in reality it was then, you know, Dan doing a bunch of actions needed to get the customer set up. Now, these customers ended up sort of becoming very happy customers and ultimately paid customers. But these design partners were really getting me and Dan disguised as a product. And, you know, honestly, it was a quick way to learn. I mean, I think we had a very different bend on the product at the start of YC than in the middle of YC. Our product changed probably most in that sort of two month period between kind of where we started as an idea and what there was actual pull for and what customers wanted and actually what got customers off the ground. You know, my intuition was that it was all centered around the data model. Ironically, our work now looks a little bit more around the data model, but in practice, it was actually more about getting their ETL set up, getting their data warehouse set up with some good defaults that would set them up for success. So our product has changed quite a bit in four years. But I think I would be dishonest if I said that our product in 2020 was amazing. It wasn't a product, it was humans and a hint of a dash of product. And I think that that actually did enable us to do an important, we don't call it a pivot, but I think you could classify it as that important pivot of the company to the product, largely that you see today, which is an all in one data platform.

Saad Ali (0:19:44) - Interesting.
So just like you mentioned, the YC journey, we have talked to, like, so many of, you know, YC founders coming in, everybody has a different story of, you know, how they got started, how they transitioned towards. But a very common thing that's a practice I think, you know, strongly preached by YC is like, you know, the kind of model that you followed where it was like, you know, you package yourself as a product and just like you were doing a lot of things in the background, like starting off. You know, for people thinking of, you know, trying to get into YC and, you know, these companies, like, you know, a lot of people try going after the same thing. So what's your recommendation to, you know, people applying for like, the upcoming batch or something?

Peter Fishman (0:20:27) - So these things are a little bit separate. So in terms of, like, if you want to start a company, you have to be crazy passionate about that thing. You have to feel like it must exist in the world and you have to want to make that happen. I've been at the journey for four years and we're really scratching the surface. So this is something that Dan and I, and I love working with Dan, but Dan and I are really committed to working on for a long amount of time. And they kind of don't tell you that when you get started. Now, you sign up through one of the services or you click a few buttons, then you're off to the races. But in practice, it's a real commitment to yourself, your co-founder or co-founders, and then ultimately your investors and employees. But really, it's a huge commitment to yourself and your co-founders. So in terms of getting into YC, Dan and I at the very least, had a lot of founder market fit. The amazing truth about YC is all of the tricks on how to get in are all like public information. So day maybe negative two of YC. Before even the program started, we were required to write a two sentence description of what we do. What is Mozart Data in two sentences?
And they give examples about Airbnb or Stripe in a way that you can understand, okay, this is what Airbnb does. And this is also like their interview criteria. So if you ask me to describe Mozart Data, and I sort of almost like you did in your first question, and then I ramble about it for five minutes, this is really unappealing to a YC partner. They want, in the same way that you can very crisply describe Airbnb or Stripe, you have to be able to very crisply describe in simple layman terms what it is that you're doing. The next thing to know is that if not all, most YC companies pivot or micro pivot or whatever it is within YC or within their journey. So really, and this is true for seed investors, too, it's about a lot of conviction around the founders. Now, for me, a cheat code was working in a space, you know, to be totally honest, Dan and I were two of the, I'll just call them the most experienced, meaning sort of, well, I'll just, you know, we'll just say I like that term rather than the other term in our class. And, you know, that didn't give us like a leg up. In fact, you know, Dan and I actually got waitlisted from our batch, ultimately to get off the waitlist in a re-interview. But, you know, I think we had a real, both of us had a real sort of founder market fit. So we understood the problem, we had a lot of opinions about the problem. We had a lot of sort of unique angles and tacks on the problem. Something that really helped us be successful within YC is that I really did leverage my own network to do a lot of what I would call free work with my design partners, which ultimately served as a real proof point for the next step. So one of the things in startup world and in YC and really just in life is that you get into these virtuous cycles that sort of, again, it almost ties back to your first question of connecting the dots. It's connecting that sort of backwards.
What you'll find is our successes within our design partners led to, honestly, some pretty small contracts within those folks. But those small contracts did expand and that expansion led to a lot of success cases, not just on G2, but on our blog. Some of our best marketing and GTM and sales materials all relate to those initial successes. And then those beget more customers. And then you hope to build off of those successes. The B2B world, the YC world. When you work out of communities, these communities end up being small. So it sort of starts with really good work. One of the cheat codes to really good work is being an expert. One of the cheat codes is to be really passionate about it, but you can get there without either of those things. But it's just harder. And the journey is hard enough as is.

Adil Saleh (0:25:07) - Yeah, absolutely, absolutely. And thinking, and this is what I'm asking for a lot of these platforms more towards like data integration, not exactly ETL, but you know, data integration platforms, they have a real hard time tracking their go to market initially because they're stuck in the onboarding. Their sales cycle is pretty big. They're thinking of just like YC methodology, do things that don't scale. What I think is that it's counterintuitive in data integration platforms. When you're building a B2B SaaS for data integration, you got to make sure that infrastructure is pretty solid, scalable, even if you have like a couple of customers that can like, have millions of data points that they're tracking on day one. So you got to make sure the infrastructure is pretty well scalable. So what do you suggest for these platforms? Only for data integration that are just getting in, sitting at seed to Series A in the first two years and, you know, they're kind of struggling in the data modeling and they're, you know, when they get bigger customers.
There's so much hand holding, so much of, you know, white glove services, a lot of cost involved as well. So I'm sure this is quite a broad question. So if you can just speak to some of the cases you might have faced with Mozart Data in the early days.

Peter Fishman (0:26:21) - Yeah, I mean, the short version is I'm a coward. So, like, I shy away from these types of really deep data engineering work that inevitably comes with, you know, sort of, I'll just call it, like, janky systems. More power to the folks that do it. And there are huge businesses in it. So I want to be very clear. We went down market to avoid these problems. There are very, very lucrative businesses to be built fixing these problems. We just, at least to start our business, kind of ran away from a bigger fight like that because we figured we could do sort of the opposite. If you've got largely vanilla problems, we could figure out how to do a great job, like a world class job at that, in a scalable way. So that's been largely my own philosophy and approach. Now, that said, I think that most successful businesses figure out a way to move upstream because those are the upmarket, really, those are the biggest checks. And typically any deal that's going to involve a lot of handholding. And by the way, kind of like you said, these data integration platforms end up really being a lot of consulting and a lot less like, product. Now, that was certainly our initial model, and today we do have a PLG motion, so you can get on Mozart Data for free. You don't have to talk to anybody. But this is actually only a small part of our business. While we have that offering, and I'm really proud of that offering, it's not sort of really the most lucrative part. Unsurprisingly, the most lucrative parts are as you move up market. The one thing I'll say is, if you have a number of customers, you do start to see problems repeat themselves.
And that's where, again, I can say this, because it's really not a compliment about me. I do believe Dan is a visionary. I think he's got a beautiful vision for what a really great data stack looks like at a technology company. I believe he is a visionary in that respect. But it helps a lot to cheat off of your customer requests and your customer actions. So having that data, it's really the marriage of those two things. You want to have the vision for how to solve a customer problem, but you want to really see what the customer problems are and what they're willing to pay for to really understand what makes for a good business. And then you need to, you know, there's one piece that I haven't mentioned, which is you have to actually go ahead and execute against it. It's not just good enough to figure out the solution. You have to actually build it and build it well. And for that, we have a really great product team. So that's not really Dan. That is the, you know, the folks that work on Dan's team, but it's certainly not me. So I think I can very easily brag about it because it's really not boasting.

Saad Ali (0:29:55) - Yeah, you mentioned this. I'm sure, like a lot of people, like, you know, a two founder team, they would love to hear this. Like, you know, me and Adil, we also have this conversation a lot. You just mentioned Dan, right? So just like inside Mozart Data, if you would divide around, like, you think, you telling us that he's a visionary in terms of thinking, like, you know, the technology side of it, what the perfect data stack looks like, and you jumping to the other stuff. So what does your coordination look like and how does that help you along the way?

Peter Fishman (0:30:26) - Yeah, so I think we're a little bit non-traditional. So we don't have a CPO. We have Dan, who is our CTO, but also really our CPO. We do have a product team as well.
We have Dylan, who's our VP of revenue, who sort of runs the GTM side of the house. And Dan and Dylan report to me. And what I do is a little bit ambiguous. So I am the co-founder and CEO. Typically, a CEO takes a very active role in GTM or a very active role in production or a very active role in sort of blending those two. In practice, Dan and Dylan work extremely well together, and I'm sort of out there. So there's sort of, like, I think, a little bit less traditional of a setup. So I think a typical setup would be a CEO is very capable as a product manager and works very closely with the CTO to co-develop product. While I have lots of opinions on data stacks, and I've spent the last sort of 20 years of my life sort of living that world, I do see that vision as largely Dan's. And then when it comes to GTM, Dylan is just a lot better than I am. So our team's in much more capable hands. I was our first salesperson, and I like to say Dylan didn't get us to where we are. I got us from like zero to over $20,000 of ARR. And, you know, that is, of course, a punchline now to a joke, but it is also factually true. And, you know, and then Dylan really has taken it from there. So, you know, I think there's now sort of this new role for CEO. And that's not really to be like a, you know, a TikTok influencer, but it's really to immerse yourself in the community and to really understand some of the sort of underlying movement of the industry, to connect within the industry, to be able to make connections within the industry. Some of our, you know, most powerful GTM motion tends to be with downstream partners of ours. So folks that the customers kind of think what I call from right to left and I think from left to right. So I think, like, the data gets created, then the data gets moved to a warehouse, then you sort of clean the data and then you use it downstream.
And actually, most people think, I'm sitting in a board meeting presenting this brilliant graph and then I need sort of, you know, a tool that's going to help me create the graph. And then they finally get to sort of the left side. And my job is sort of to connect within the ecosystem, to question sort of Dan and Dylan. You know, I sort of have carved out a unique CEO role, but I would say that the more traditional looks like either like GTM centric or product centric. And I will say I've sort of gone my own way, but I think it sort of speaks to the strengths, the different strengths of the leaders on our team.

Adil Saleh (0:33:37) - Yeah, very interesting, because, you know, like Steve Jobs always said, like, if you make the right decision on top line, bottom line will follow. You got to make sure that you have the vision. You're customer obsessed. You're looking at the market every day, day in, day out, and you're engaging with the community, what's going around, and then you take that exposure into the team and then you work with, like, your two other co-founders. They pretty much complement, you know, for you to have that exposure and get it to the works, like, executed and get it reflected to the product and technology and solutions and all. So now thinking about your, you know, as a product, like from a product-led growth standpoint, you've invested heavily into it in the last few years. How do you see it? Like, is it like core product-led growth or is this just having some sort of, you know, account executive or CSM touch? Talking about only GTM teams and revenue teams.

Peter Fishman (0:34:31) - Yeah, I mean, I think I'd sort of start backwards. Our most successful companies have like working data models, so they really use the data. So in enterprise, a little bit more than SMB, you can sort of get away with like, I'm just going to say this, doing almost nothing.
So there exists a lot of shelfware in enterprise, whereas in SMB you tend to care a lot more about costs. I mean, a lot of our companies are grinding. They're startups, or even larger startups or mid-market, and you're still sort of grinding. And especially after 2021, budgets in the data space really did need to be rationalized. So it's almost across the board. But what we find is that our most successful companies have thoughtful data models. They might not all start with big data science teams. In fact, they don't. Our ICP tends to be operators: CPOs, CTOs, RevOps, BizOps, marketing ops. It really is data-obsessed folks without the data engineering chops or background. But what we find is that they all invest in getting data set up right, and then ultimately they're very successful with it. And typically this is driven by their own passion and their own capabilities and their own skills. Sometimes it's spreadsheet skills, but sometimes it's spreadsheet skills that really grow into strong SQL skills and BI skills and reporting skills. It's really somebody that knows how to maneuver the operations of the company. So working backwards from that, our most successful implementations involve working hand in hand with that team. We know that we're going to succeed when that company is consuming a lot more data, and we know the key to them consuming a lot more data is having those successes. So there is what we think of as a very worthwhile upfront investment in really giving a push in the back on the swing: getting their connectors connected and their data model up and running. And, you know, we've won some awards for our customer service. I am super proud of that team. They're incredible.
But it's for, you know, greedy reasons. I think that getting them successful is going to make us more successful in the long run. You can do the calculations on that, and we have. And what we found is that our business very deeply relies on good net dollar retention. So it relies on these customers expanding. And to do that, you need to, I don't like the word hand-hold, but really just give them a real push in the back. Now, there are many successful companies that need none of that. They just need to be off to the races. And that's where our PLG tool is. A lot of technologists, honestly, and no offense, I'm such a big fan of Dylan's, just don't want to talk to a salesperson. And we wanted to offer that route as well. That route is there because there are ways to succeed without having your hand held. And not only that, some people explicitly don't like it. So we want to try to meet our customers where they are. Now, one of the things I will say is that while I can describe our customer profile pretty well, it's almost like a sketch as opposed to a 4K photo. Because our customer profile is a little bit strange, different customers behave very differently. They have different data chops, they have different data expectations. So part of that is really understanding where that customer is. Saad Ali (0:38:44) - Okay. Peter Fishman (0:38:44) - Yeah. Saad Ali (0:38:44) - One insight that you mentioned, which I think is very key and that a lot of folks listening in should understand, is that if you have a key stickiness factor, it might involve several touch points where you're interacting with the customer.
But if the stickiness is good, like, for example, the value that you're adding for your customer as soon as you get that data activated, then the stickiness is strong, because once they start seeing value out of it, they're going to invest more into it rather than scale down. So at the end of the day, any data that you're operationalizing turns into better retention for your customers, and also into more understanding for their revenue teams. So that's, I think, a very key factor. And one thing that you mentioned was the cost optimization side of it. So just thinking about a lot of the added tooling that comes in once you start to jump into it: previously we had a database, then came data warehouses. Now, with Snowflake, I've heard so many people talk about how costs start to ramp up. So how do you ensure for your customers, when they're starting at this point, that Snowflake makes sense for them in terms of cost, and how does that ramp up? Peter Fishman (0:39:53) - You know, powerful columnar warehouses are very expensive. So not just Snowflake: Redshift, BigQuery, they all run the risk of being very expensive. You could just be querying and querying and querying, and setting your warehouse size bigger and bigger. You could have an SLA like, I need that result right now, right now, right now. And that can really run up your bill. Or you could run a job that's essentially just gobbling up compute. So it is certainly the case that large organizations have the ability to rack up very large bills, cloud bills, period. The one thing I will say, and this is not a defense of our pricing or Snowflake's pricing: the most expensive part by far is not the data infrastructure. It's the human, and also the customer acquisition costs and the unit economics of the business. But the most important part, and the most expensive part, is by far the human.
That said, it is very easy to run up a large bill in any of the cloud data warehouses. They are incredibly powerful. The prices have come down. Snowflake pioneering the separation of storage and compute has allowed a totally new approach, effectively ELT: bringing a lot of data to your warehouse and then making sense of it once it's in the warehouse. So all those things are true. There are a lot of quick wins to be had. We do things like smart defaults, such as turning off your warehouse when it's not being used, as opposed to racking up additional bill. We are doing a lot of things around what we call smart home. So I'm currently at the office. I'm not at the house. The heat gets lowered. I typically get home in the evening, and before I get home, the heat will turn back on. So I experience the house largely the same way. My dog might be a little bit chilly during the day. This is an analogy, of course, to how people should run their warehouse. They probably don't need their data refreshed like crazy at three in the morning. But some nights they might literally be dealing with some sort of fire, and they need that data refresh. So they should have the option to do it, but it shouldn't be the default, in the same way that if I'm home on a holiday, I can turn up the heat during the day. But I think it's about having smart defaults, having systems that are smart about when you're actually consuming data. I mean, ultimately, where people's bills get really big on the infra side is when they're trying to have almost ridiculous SLAs relative to how they consume data in reality.
So a lot of times, when we talk to maybe a more novice data person, we'll ask, okay, well, when do you need your data: near real time or real time? And they'll say real time. Well, that changes the equation quite a bit. So now you're talking about needing totally different services, a very different level of service. And that's a capability that's not actually used when you're doing BI or analysis or even downstream-of-the-warehouse data activation. So I think it's really about matching the real need, not the stated need. You know, I drive a Toyota Camry. I don't need a Ferrari to get me to and from the supermarket. I wouldn't mind it, but I think it would be the wrong use of my own personal resources. And the same is true here. Now, you then ask the question, okay, should startups really be using a powerful tool like Snowflake? Of course I'm biased, but my answer is yes. Again, because of this separation of storage and compute, you can get away with Snowflake being rather inexpensive. But it's, again, very usage-centric. If you use an extra-small warehouse for an hour, a single credit retails around $3. I mean, some are more expensive than that. Almost nobody that's using Snowflake costs $3 an hour. The most expensive part of your data warehouse is the human that's interacting with it, either managing that warehouse, managing the queue, or more accurately, writing queries against it. So again, it's really about getting total cost correct. And we've really figured out a way to make the cost equation work for downstream companies. And our hope is that those companies scale up. Saad Ali (0:45:40) - Yeah, that's interesting. And starting off, a lot of these companies, we talk to them.
They are, like, leaning towards, okay, now it's the right time to move towards building out infrastructure, a data warehouse and all of that. What are your thoughts around these new storage engines? If you already have a database, let's say you're using a Postgres database, there are now engines that can provide columnar storage right inside your primary database. So do you think companies at an early stage should jump towards them? What does the future look like on that? There's a company, I think a YC company, called Hydra. They are building a Postgres columnar engine that's very fast for analytics. So normally you have data in your database, and then you're syncing that to a data warehouse. If you have both of them in a single place, you don't have data movement costs and all of that. So what are your thoughts on this industry? How is this going to move forward? Peter Fishman (0:46:37) - Sure. I actually know just from Hydra. Like, the YC community is very small, and I certainly love the thoughts around cost effectiveness. I think that's always one of our customers' most requested features, and they're always trying to be thoughtful about their costs because typically they have to answer to someone. Maybe there's a CFO, but more typically a founder or CEO. In general, of course, I feel somewhat differently. Right. So almost all Mozart customers are on Mozart-managed Snowflake. Now we have two bring-your-own options: you can bring your own Snowflake or bring your own BigQuery. So we do actually allow customers to bring their own. But should they be using a Postgres database as their data warehouse, or even new tooling that would make that more powerful?
And again, you get to this standard question around data sophistication. If you have a pretty sophisticated data team, you can take a variety of non-vanilla approaches. We offer a best-in-class stack that most companies, even really technically savvy and sound companies, should be using, or at least the underlying tooling. It doesn't necessarily have to be Mozart, but we offer a best in class for a vanilla application of using data for reporting and BI and analysis and data activation. So at the highest level, no. I believe, at my core, that this is what the right stack looks like. I mean, it's the stack that I've been using for the last 15 years. Again, new technologies, different names, but it is one that I truly believe in. And I believe in this approach. So it's a little bit, I would just call it tricky. I'm voting with my feet and with my hands, and my hands say this is the right way to do it. If you have the resources, that's great. There are many approaches to get to the answer, and we're really obsessed with startups getting to effective answers, and getting to them quickly. I think it's generally a misuse of resources when you have engineers or engineering teams dive into their data pipelines. A company like Fivetran is over 1000 people and has no data engineers. It's a little bit of a cheesy example, and certainly I know a lot of the team there, and of course they have no data engineers, because they're a data engineering service. So they're sort of intentionally doing that. The people there are very, very skilled at data engineering. So that is not to say that they don't have those chops. They definitely have those chops in house, but they've basically been able to build an enterprise company without data engineering resources.
And I would argue that we are at a place in the sophistication of the modern data stack where engineering resources should be put towards your application and figuring out how to get more customers, and shouldn't be working towards more efficiently building the data stack, which in most cases is a solved problem that I think looks like Mozart, or ideally, for us, is Mozart. Adil Saleh (0:50:26) - Interesting. That's going to open a lot of conversations when this goes to the web. Saad Ali (0:50:33) - Yeah, that's very interesting, and that makes total sense. We also keep preaching to most of these companies, specifically for customer-facing functions, that their teams should be focusing more on taking action rather than focusing on data. So that makes real sense. This data is already being activated. How does that get to those customers? You are taking it to the data warehouse and providing best-in-class data over there. Then we have these other solutions, the reverse ETL solutions, that will pick it up from the data warehouse and push it back to customer-facing teams. So just thinking, what's next on the horizon for Mozart Data? What's the thing that you would like to achieve in the next couple of years, and what does that look like? Peter Fishman (0:51:21) - So, before I answer that, I do want to expand on something you just said. I think YC partners talk about this a lot, that one of the many fail cases is when you try to innovate on too many things. They give sort of an opposite example of when it actually worked out. So very famously, OpenAI has this very crazy corporate structure. And yet OpenAI probably is the most successful company of the last five years.
So it's like, well, look, they've innovated on that. But that's not why OpenAI is successful. OpenAI is successful because they're amazing. I think you have this sort of conflation because a lot of times, great founders try to innovate on many things, because they see problems with many things. I mean, part of being a great founder is seeing a real problem, something that really bugs you, and being obsessed with trying to change that in the world. So one of the things that YC partners talk about as a fail case is trying to innovate on too many things. They said, like, Facebook was initially a Florida LLC or something like that, when everybody like us should be, you know, a Delaware C corp or something like that. So this leads into the answer to your last question. You have a hard enough time as is trying to really deeply solve a customer problem: to find it, identify it, recognize the scalability of it and the pattern matching of it, and have the technical wherewithal to actually solve it in a great and cost-effective way. That's pretty hard. And we're four years into this very challenging journey, and it doesn't get easier. And then I see friends of mine, many of whom have built way more successful startups that are many times our scale. I'd think, oh, once we get over this one little revenue hump, it'll be pretty easy, right? And no, they look just as stressed as I am, if not more. So I'm like, maybe when you get to the Jeff Bezos scale, when you're on the yacht most of the time, then maybe it gets easier.
But before that scale, I think it just constantly gets harder and harder. I just constantly try to trick myself and say it'll get easier. So the thing that I'm most looking forward to in the next few years at Mozart is it getting easier. In practice, it won't. The thing that we're excited about, like any company that exists in the world, is that AI is here and it's going to create a lot of opportunities. I think it's also going to distract a lot of companies. And by the way, we think that's probably going to enhance our own opportunity, as companies get distracted by flashy things. For us in the data space, that's been a lot of, I'll call it English-to-SQL or natural-language SQL writers, trying to democratize data that way. We at our core aren't really bought in to that solution yet. We are looking forward to the day when that will happen, and it will happen, but it's not here today. But what is here is the ability to leverage AI for a lot of rote, mundane, and boring tasks in data pipelining. That's maybe cataloging or some variant of that, or even column descriptors. We have our own AI product within Mozart that will do some of that work that data analysts should be doing, but that they really hate and get really lazy about. So going in almost the opposite direction, technical or SQL to English, is I think a really great opportunity for leveraging AI these days, and finding these clever applications of AI. But at the end of the day, most of what we'll be doing for the next, whether it's one year or even ten years, is really solving our core problem: again, making that onboarding even easier.
It's so funny: when we look at just how many hours must have been poured into the Airbnb homepage or the Facebook feed page, it's like, geez, we have so much more ground to cover in terms of making spinning up the modern data stack even easier. So that's where we spend most of our cycles. But some of the exciting cycles, the hack day cycles, do get spent on the cool technologies. Saad Ali (0:56:29) - Interesting. And you mentioned focusing on that one key thing, that for the next couple of years you plan to stay very focused on a specific problem. I would like to know more about the ETL space. How do you think this is going to go? Because in a lot of other industries, the trend that we're seeing is consolidation. We have seen a lot of companies like Clari and Salesloft coming in. Where there was space for different specialized solutions, now they are trying to group them together into an all-in-one solution. So how do you think about this specifically in the ETL space? Let's say you do ETL very well today. What happens if a customer that's doing ETL with you wants to do reverse ETL? What does that look like for you? Would you go in that direction, or do you think this is a big enough problem to just stay very focused on? Peter Fishman (0:57:23) - So what you have is this crazy pendulum: you have to bundle everything, and then you have to unbundle everything. You have to bundle everything, and then you have to unbundle everything. And there is a lot of power in the unbundling. What you have is point solutions that are best in class, where somebody has poured their heart and energy into a best-in-class point solution.
But then very quickly, you realize, actually, these things aren't very well integrated. And you need, in this case, I'd argue, a Mozart-like solution, although I happen to like Mozart's version of the solution. You need solutions that actually bring these point solutions together, because what you find is that these best-in-class solutions are maybe a little bit daunting for the operator. Nobody can be skilled at everything, and bringing these all together in all-in-one solutions is, I think, powerful. But what we might find five years from now is, nope, we've got to break apart these all-in-one solutions. It's always: centralize whatever is decentralized, and the other way around. And I think we're at the part of the pendulum where that's obvious. In some sense, that was some of the vision for Mozart: to say, okay, these best-in-class solutions have been identified within the modern data stack in a way that I really believe in for the next, say, decade. And we wanted to build on top of that, to build a really integrated solution so that people could do the hard parts of their job, which is understanding their business and understanding what would move the business forward, and not worry about what had been perceived as the hard part, which is setting up all of the technical stack to empower them to do it. That has now become sufficiently rote and commoditized thanks to the wonderful advances that we've had in the last decade. Adil Saleh (0:59:21) - Interesting, interesting. So now I would like to know more about what kind of culture you guys are cultivating. We don't have much time, just five to six minutes more.
You know, I know that your team is around 40 to 50 people at this point. What kind of vision or operating principles are you driving, sitting in the top office, and how are you trying to carry them across teams and all these operations? Peter Fishman (0:59:50) - Sure. I mean, I think different cultures can be very successful. Some people have a nine-to-nine, six-days-a-week culture, and some people have a real nine-to-five, five days a week, and you don't even see them on Fridays. Some people have a culture of, we give very direct and honest feedback, harsh feedback, and we fire the bottom x percent. And some people have, we treat everybody like a family, or a sports team, or this or that. So I think cultures really emanate from the founders. Dan and I have a few core values that we think are reflected both in our company culture and in our product. One of them that I'll highlight, and this is going to sound bad, but I assure you that it's not: we're big believers in the Pareto principle, the 80/20 rule. Facebook's version of this is move fast and break things. You can call it a variety of different things, but the 80/20 rule for us is largely just that 80% of the value comes with 20% of the work. In our transformation layer, we have a lot of tooling: we have cataloging and observability capabilities and all of these things. But we have a lot of great customers that actually graduate up into a specialized tool that does one of those things. We get 80% of the way there. And for those customers that maybe have real, serious, deeper needs in any one of those pieces, we have great partnerships with many of the other players in the data space. So that 80/20 philosophy speaks to some of the products that we build.
Some of the products we build are 100/100 because they are mission critical. But some of them we really do believe in going 80/20 on. And that is because, honestly, that's where the value is. That's where the ROI is; that last mile isn't really worth it for the startup. So 80/20 applies to many things within our company. We want people to be, just putting it directly, a little bit frugal about some of the tech excesses. You can get about 80% of the value from about 20% of the spend. I used to work at some big tech companies that had some really fancy spend on a lot of things, over-the-top parties, et cetera. But honestly, getting together with your team and a couple of cheap bottles of wine ends up being almost as effective as a big blowout night at a fancy club. And that's not even 80/20, that's more like 90/10. So that is one. We have a bunch of core values, and that's one of them. Another one that I will highlight, we just call it hack day. Our first employee, John, he and I worked together at Yammer (Microsoft), and he had won a few hack days. And Dan was really excited to work with him, because Dan himself had won a few hack days at his previous job at Clover. So we do hack day about once a quarter. And of course, so much of our actual product roadmap comes out of it. Now, very few things get shipped directly from hack day, but we have a good, healthy, competitive hack day where people really get to grow and expand themselves and do great work. It also connects them to the company mission and the company business in a way that we think is really important. Adil Saleh (1:03:59) - Yeah, one thing I really loved is that with all these values, you make sure that they are connected to the business outcomes in some shape or form, which is really great.
So I really appreciate it. Just one last thing: are you guys hiring for any roles, in the product team, marketing, sales, revenue, any team? You can mention that real quick. Peter Fishman (1:04:25) - Sure. I mean, predominantly our open roles are in engineering. We are trying to grow the product footprint, and we obviously have been really focused on growing our customer base. We have a great GTM team, but we're always thinking about marketing and top of funnel as a really great place where I think we can add to our team. Adil Saleh (1:04:52) - Yeah, inspiring. You know, I've rarely found people, I can count them on my fingers, that have this level of energy and knowledge and the way you communicate it. I really appreciate your time and all that you've shared, pretty much unfiltered and very much opinionated. I've learned a lot, to be very honest, even before this has gotten published. So thank you very much for your time, Peter. I really enjoyed it. Peter Fishman (1:05:18) - Thanks so much for having me. Adil Saleh (1:05:20) - Love that. Thank you very much, Saad. Saad Ali (1:05:22) - Yeah. Thank you so much. Thank you, guys. Bye.
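
Editor's sketch: the warehouse cost discussion in the transcript (an extra-small warehouse at "around $3" per credit, and the "smart default" of turning the warehouse off when it is idle) can be illustrated with a little arithmetic. This is not Mozart Data's billing logic. The credit price is the rough figure quoted in the interview, and the one-credit-per-hour rate, the 30-day month, and the usage hours are assumptions chosen only for illustration.

```python
# Illustrative arithmetic only; every constant below is an assumption.
CREDIT_PRICE_USD = 3.0  # "around $3" per credit, as quoted in the interview
CREDITS_PER_HOUR = 1    # assumed burn rate for an extra-small warehouse


def monthly_compute_cost(active_hours_per_day: float, days: int = 30) -> float:
    """Compute cost of a warehouse that is billed only while it is running."""
    return active_hours_per_day * days * CREDITS_PER_HOUR * CREDIT_PRICE_USD


# A warehouse that is never suspended versus one that runs 8 hours a day.
always_on = monthly_compute_cost(24)
business_hours = monthly_compute_cost(8)

print(f"always on:   ${always_on:,.0f} per month")
print(f"8 hours/day: ${business_hours:,.0f} per month")
```

Under these assumptions the always-on warehouse costs three times as much as the one that is suspended outside an eight-hour window, which is the point of the "smart home" analogy: the default should match how the data is actually consumed.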
