#27. On AI, Data Contracts, and Career Evolution with Mark Freeman Episode 27

· 29:00


Sheikh Shuvo: Today's guest is Mark Freeman. Mark started his career as a pre-med student with an interest in sociology, but ultimately he fell in love with research and experiment design, leading him to a career in data science. When he's not producing great content on how to level up your data science game, he's working on building the future of data contracts at Gable AI.

Hey, Mark. Thank you so much for joining.

Mark Freeman: Thanks so much for having me.

Sheikh Shuvo: Well, Mark, the very first question I have for you is, especially as someone who's dabbled in so many different parts of the data world. How do you describe your work to a five-year-old?

Mark Freeman: It depends on what stage of my career. If I had to describe my startup job, right now that would be much more difficult. I would basically describe it as wherever there's a fire or there's an issue, I just sprint towards it and fix it. Cause that's just what startups are about. But if I were to focus more on the data side, the data engineering side, I would talk about the apps they use. So, like, I imagine a lot of young kids use TikTok or YouTube. There's a lot of information brought to you for that. On the data science side, it's which videos do you see first? What's going to keep you most engaged? On the data engineering side, how do I bring data from one point to another point where it's personalized to you? In a really fast way. And so I'm just in the business of thinking about data from all sides. And my career has really reflected that from being a practitioner to the business side, anywhere in between.

Sheikh Shuvo: Like, if you go with the firefighting analogy, you're driving bigger and bigger engines every day.

Mark Freeman: Pretty much. I'm working on the biggest engine I've used yet, and it's a wild journey.

Sheikh Shuvo: Well, Mark, you've had a very different career origin than a lot of data science professionals out there. And, you know, during your undergrad days, you spent a lot of time researching sociology. Could you talk a little bit about that and how that transformed into you becoming a data science professional?

Mark Freeman: I love sociology so much. I was a pre-med student. I went to community college first before going to university, and my major was chemistry. I ended up taking all the sociology classes I needed to transfer on accident because I was just so interested in them, and my counselor was like, "Hey, you can transfer as a chemistry major years from now, or you can transfer as a sociology major right now." I was like, that's an easy decision. I'm a sociology major. And that is actually how I learned how to really work with abstract ideas. Many times in sociology, it's hard to run experiments because you're working with populations, and you're thinking about these squishy, fuzzy kinds of changes to describe our world. Right? And through that, when I transferred over to UC Davis for university, I got involved with doing research in sociology. My research was on the Black Lives Matter movement, looking at media and college perceptions of it. That was my first time really diving into the research process: not just doing an assignment on experimental design, but actually doing the experimental design, getting the paper ready, going through professors to understand how the theories apply, presenting at a symposium, and actually getting first place at that symposium. It was judged one of the best research projects, and that was a huge turning point for me, because I'm not a good student. I actually don't like school too much, which is surprising because I have my master's degree.

But I had a lot of imposter syndrome cause I wasn't a good student. And so me doing research finally made a click in my head.

I'm like, no, Mark, you belong in these academic environments. You can really actually do some incredible work. And from that point, that symposium and that research, which was actually my senior year of college, that really changed my perspective about myself, how I engage with academia and tough topics and really about experimental design.

I became obsessed with research because it was just so much fun for me.

Sheikh Shuvo: Going from those early research days, in working with very sensitive data, what were some of the unique challenges that you had in working through healthcare and sociological information, especially on the experimental design side? It sounds way trickier.

Mark Freeman: It really depends. So right now, looking back, it's definitely intimidating, but because I was pre-med that entire time, I was working in clinics. So I was exposed to HIPAA and PII even before I really understood data. And basically, when you work in healthcare, they put the fear of God in you around security and HIPAA compliance. Basically, like, if you mess this up, you go to jail. Which I think is a good thing to have. So for years before I even got into data, that was just on my mind: I protect data and keep it secure. Right. So when it comes down to the experimental design side, PII is a big part of that, but I would argue it comes at a much later stage. That's the data collection stage and data management stage. When doing experimental design, you're asking, how do I manage bias? I'm trying to understand the question. What's the best way for me to answer this question? By controlling everything around it so that I measure the exact thing. And being in sociology, like I said before, you really can't do experiments. There may be natural experiments that you run. But many times you're doing observational studies, and that's what I really specialized in, especially when I did my master's, where I did community health prevention research. So bridging healthcare with sociology, through quantitative analysis and observational studies, I really learned, okay, we have data. It was collected for a different purpose, but we can repurpose it. There are additional steps we need to take to make sure that it's actually viable and that it has statistical power. And that's where I also got really, really, really into data quality and data management. I think I went on a little bit of a tangent. I don't know if I answered your initial question.

Sheikh Shuvo: I love a good tangent, especially when it leads to more geekery. With your pre-med background, outside of the ones focused on research and data, were there any other particular courses that really resonated with you that influenced your career track?

Mark Freeman: That's a good question. I would say grad school, with Dr. Sanjay Basu, was impactful. He's an exceptional researcher. You can look him up. He teaches at Stanford, Harvard, and other places. He taught me how to think critically and frame real-life problems as math problems, as data problems. He had a course, which was on modeling public health. He wrote a book on this, and he walked us through his book. It's about various optimization problems in public health, how to turn them into a model, and into statistical conversions and systems. It's a very systems-thinking approach. That was the first class where I learned how to code in R. That was my first coding language, and that's what kind of kicked off the data thing. That was the first class where I was like, "This is really cool. I want more of this." I got obsessed with how to think about real-world problems as data problems, as math problems. Then how to code it and show that work. It was just the coolest thing ever to me.

Sheikh Shuvo: Do you consider yourself a statistician at all, especially coming from an R background?

Mark Freeman: 100%. Maybe not as much now because I'm not as deep into statistics, but I specialized in observational studies for statistics, and I have a pretty good toolkit for that. In research I am published in, my contributions were the statistics. I was heavy in R, did experimental design for statistics. If you ask me about observational studies and experimental design and statistics questions, I will talk your ear off for a while because I just love it. The first thing I do when I read research is go into the methods section and see what they did.

Sheikh Shuvo: I still remember my very first econometrics course in school. I just totally excelled and I don't have the same passion, but I'm glad that folks like you exist.

Mark Freeman: It's a brain teaser for me. Actually, my first stats class in grad school, I almost failed. I went to the professor and I was like, "Look, I'm not cutting it here. Should I stay?" He was like, "You'll learn, you'll pick it up." And I'm glad I listened to him because by the end of the semester, I was really catching steam, and by the next course that built on top of it, I was really on top of it.

Sheikh Shuvo: When you started branching outside of research into industry, you started your career as a data scientist and eventually got promotions into senior data scientist and then senior data engineer. For people not familiar with the career progressions in that world, could you share sort of what some of the differences between a data scientist and a senior data scientist are, what responsibilities change, how does your lens on your work change?

Mark Freeman: So I think the big thing, and I think this can apply to a lot of technical roles in general, is that typically you'll see this junior level, then you'll have the senior level, and then you have the staff level. At the junior level, you're more so just expected to execute. It's like, "Hey, you're coming in. We're going to assign you tasks. You're going to get it done. Make sure it's to a certain degree of correctness," and things are just kind of laid out for you, and you just execute, right? But moving to that senior side is when you move into a lot of ambiguity. My boss would be like, "This is the goal. Figure it out." You're getting a lot more autonomy, and that transition from the entry to the senior level is about working with that level of autonomy. But what really cements you in that senior role is your ability to have an impact, not just with you and your manager and single projects, but for your entire team. And ideally, other teams as well. And so how can you work across the business to identify opportunities, scope out what needs to be done to drive that value, and then execute it? And then there's this other side, which is a little bit above, which is where I'm kind of entering in, this staff side. Maybe it's my own imposter syndrome a little bit, being unwilling to hold that kind of title myself, even though that's kind of my title externally. It's more so, how do you oversee these massive, larger projects, and start delegating to people and mentoring people to drive this strategy and drive this value with data as a team? And that's a completely different skill set that I'm still learning myself. There are a lot of great articles from much smarter people that describe that.

Sheikh Shuvo: What are some of the resources and tools that you use to level up your game in that regard then?

Mark Freeman: Unfortunately, you can't repeat it now, but shout out to Harpreet Sahota, host of The Data Science Podcast. Every week he had a weekly kind of all-hands call with a bunch of people, and that's where I met a lot of people who are now mentors. Every week I would show up with questions from things I was dealing with at my job and bring them to this group of people for a group chat. You can probably find the recordings out there. And actually, that's how I made the transition to data engineering. On those calls was Joe Reis, one of the co-authors of "Fundamentals of Data Engineering," and I would actually end up asking him questions all the time on LinkedIn, and he would respond. And he's the reason I got into data engineering, because all the problems I was trying to solve as a data scientist were actually data engineering problems. So I would go to that group, ask questions, get help, come back. I think the biggest thing is to find a group of peers and also a group of people who are way more advanced than you and just talk all the time. It's why conferences and meetups are so important, because you can talk to people who have a different perspective on the problems you're facing. And they give you a little sneak peek like, "Oh yeah, I experienced this before. This is how I would approach it." Many times, the technical problems you face as a technical person are actually easy. There's Stack Overflow, now ChatGPT, and you're smart. You can learn that yourself. But applying the technical skills to business is where, for a lot of people, it's really unknown. And that's where I was getting a lot of advice, because going from that entry to senior to staff level is not about your technical skills. Your technical skills get you the job. It's your impact on the business and applying those technical skills in unique ways that get you those promotions.

Sheikh Shuvo: Speaking of finding a group of peers, one of the things that you've done is built a really incredible community of data professionals via the On The Mark Data group. What inspired you to create that?

Mark Freeman: So On The Mark Data was a complete accident. I started posting on LinkedIn just because I wanted a job. I think a lot of people were like that. But for me, I'm obsessed with funnels and sales funnels. And so when I started posting on LinkedIn, I was laid off and I was like, "I don't want to apply to jobs. I want jobs to apply to me." And so I was going to start creating content with the goal of getting attention from my future employer, and eventually, they were going to reach out. And that's how I got my last job at Humu. And it's how I got my current job at Gable. And through that, I just started building up an audience, and one day a company reached out to me and said, "Hey, we want you to create a course for us." And me being the nerd I am, I was like, "Yeah, sure." I was going to do it for free because I just love this stuff. And they're like, "Well, what's your price?" And I was like, "Wait, people want to pay me for this?" That's wild. Right. And so to get paid, I was like, "I need an LLC." And so I created an LLC, but what really pushed me to take it seriously as a business was that before this, I had tried building startups, and with the last startup I built, we made it to building an MVP and interviewing for potential funding. But we ultimately decided it was a bad idea and chose not to pursue it. And I just saw the gaps I had as a business professional and as a founder. And so I figured I could use On The Mark Data as a low-stakes way to further build up my business skills and my founder skills. So when it's your business, you have to manage the money for it. You have to set up the strategy for it. You have to execute on that strategy, all while holding a full-time job. And I would argue it was one of the best decisions I made. Because specifically, it taught me how to do sales really well.
And being in a startup now, I do a lot of sales stuff from a technical perspective, but I got that experience from On The Mark Data where I was doing sales calls with marketing teams, or I was doing sales calls with executives who wanted to do go-to-market stuff. And through that, I learned how to have my whole sales process and how to get people hooked and how to find the right customers. And so, it accomplished my goal of really picking up my business skills, and I still have it now as an outlet for me to grow in the future, but we'll see.

Sheikh Shuvo: As you were starting to create that and scale things out, what were some of the earlier inflection points that let you know, "Hey, I'm on the right track here?"

Mark Freeman: Definitely, I think the biggest thing was joining Ken Jee, who is a very popular data science YouTuber and a good friend of mine. He started a media agency, and they give me brand deals to work with people. And about once a year, we do a content creator meetup where everyone in the agency comes to a single house, kind of like a hacker house or a content creator house, and you create a bunch of content for a week. And I met all of my heroes, and I was like, "This is the coolest thing ever." And two things happened. One, I was like, "Wait, I'm in the same room as these people. This is wild." But two, I saw people who are full-time content creators who are making pretty good money doing this, just creating content, being creative. And that's when it became clear to me. I'm like, "Oh, this isn't some side hustle fun thing. Mark, you have an opportunity here. You should really capitalize on it." And that's when I really started getting serious, and it showed in my revenue. From that date, all of a sudden, it just kept picking up, month by month. And then I joined a startup as the first employee, and it dropped right back down. But that was the biggest inflection point, seeing that this content thing isn't just fun and games. There's actual real business and money to make here. And if you take it seriously and drive a strategy with it, you can really do some amazing things.

Sheikh Shuvo: In terms of your content strategy, then, are there any particular areas you draw most inspiration from as you're thinking about new types and mediums?

Mark Freeman: So, it's less about mediums and more about the messaging. Because I think the medium just depends on what you resonate with as an individual. So whether it's video or writing, but what's more important is understanding who your audience is and what messaging resonates with them. I spend so much time doing interviews and one-on-ones with so many people all the time because I'm trying to soak up how they're thinking about problems, what's important to them, what's going to make them tick and really trying to engage. That's what I'm obsessed with. And so now, when I create content, my audience is data practitioners who are on the more senior side and also leaders, and connecting those two audiences. Right? And so I just focus deeply on what are the challenges of practitioners selling up to leaders? And what are the challenges of leaders engaging with practitioners to understand what's happening on the ground? And by focusing on that message over and over again, you really start to build a niche. People see you as that person to listen to. And that's how you really start building an audience. It's that repetition. Anyone can do this. It just takes a long time and a lot of effort.

Sheikh Shuvo: The ten-year overnight success story.

Mark Freeman: Yeah. I'm like, what, I think four years into this, and I feel like I'm just now starting to get that inflection point.

Sheikh Shuvo: Looking over your newsletter, a term that was new to me is GTM engineering. As a sales and business development guy, I think of GTMs all the time, but can you share a bit more about what GTM engineering is and how that's different from other GTM approaches?

Mark Freeman: It's a completely made-up term. But the founder of the startup I joined, the CEO, Chad Sanderson, he has a large audience, and I have a large audience. And so we have a unique problem where our problem isn't finding attention for the startup. Our problem isn't getting leads. Our problem is that we're getting millions of impressions a month and lots of people messaging us, how do we optimize on who are the right people to talk to when you're a lean startup team? And what we found is that, instead of being a marketer or a salesperson, I'm an engineer who still happens to get into that world, and that works in my favor when I work with developers because they want to talk to other developers. And what we found was that if we take an engineering-based approach to go to market, thinking about the underlying data behind it, the underlying systems behind it, and really approaching it like doing sprints for it, you can actually create a really strong system to drive your funnel, to bring attention to it and understand your expectations. And so I argue that go-to-market engineering, the definition I'm coming up with right now, is using engineering best practices and systems thinking to get your first set of customers and drive your go-to-market adoption for a company.

Sheikh Shuvo: Along those lines, tell us more about what you're cooking up at Gable.

Mark Freeman: Gable's probably been the wildest journey I've been on so far. So around last year, actually, it's been about a year, I interviewed Chad Sanderson for a newsletter and he was like, "Mark, it seems like you're looking for a new job. I'm about to start a startup. Why don't you leave your job and join me?" And the next day I said, "Yep, I'm quitting." I put in my two-week notice, and I officially joined the startup in February, though with a startup you kind of work in between that. And so I joined the three founders, specifically on this concept of data contracts, which is essentially an API-like agreement between data producers and data consumers. So think of software engineers with the database and data scientists with the data warehouse: there is a transfer of data between both sides, and they use data vastly differently from each other. And so there's this miscommunication on how data is used, which leads to data quality issues. And those data quality issues prevent you from actually leveraging data the way you want to. Data contracts serve as a way to bridge that gap by putting these data quality checks in the CI/CD process, preventing issues before a change is even merged into the code. And so Gable specifically enables data contracts. And more importantly, it's a collaboration platform between all stakeholders in the data lifecycle. The reason why I quit my job the next day to focus on it is two things. One, I experienced the pain in my data roles, and so I viscerally knew this is a problem, and if it was solved, it'd be huge. The second thing was, like I said before, my last time trying to build a company, there were some mistakes. And one of the mistakes was I didn't do enough user interviews, and that's why we ended up with a bad idea. Chad and the team did over 200 user interviews. By the time I talked to them, they're now like a thousand plus user interviews. It's actually insane.
And I saw that and I was like, "You understand the problem really well. I'm confident. Whatever we build, you understand the problem space so well that we're going to build the right thing." And so it was easy for me to jump and make that transition.
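As a rough illustration of the "API-like agreement" enforced in CI/CD that Mark describes, here is a minimal Python sketch. It is not Gable's product or API. The `Field` class, the `breaking_changes` function, and the example schemas are all hypothetical names invented for this example. The idea is that a producer's proposed schema is compared against the agreed contract before a change is merged, and any breaking difference is reported.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One column in a (hypothetical) data contract."""
    name: str
    dtype: str
    nullable: bool = False


def breaking_changes(contract: list[Field], proposed: list[Field]) -> list[str]:
    """Return human-readable problems where the proposed schema breaks the contract.

    New fields in the proposed schema are allowed; removed fields, changed
    types, and newly nullable fields are treated as breaking.
    """
    proposed_by_name = {field.name: field for field in proposed}
    problems = []
    for expected in contract:
        actual = proposed_by_name.get(expected.name)
        if actual is None:
            problems.append(f"{expected.name}: field removed")
            continue
        if actual.dtype != expected.dtype:
            problems.append(
                f"{expected.name}: type changed {expected.dtype} -> {actual.dtype}"
            )
        if actual.nullable and not expected.nullable:
            problems.append(f"{expected.name}: became nullable")
    return problems


# Illustrative data only: what consumers rely on vs. what a producer wants to merge.
CONTRACT = [Field("user_id", "int"), Field("amount", "float")]
PROPOSED = [
    Field("user_id", "str"),
    Field("amount", "float", nullable=True),
    Field("note", "str", nullable=True),
]
```

In a CI job, a script could call `breaking_changes(CONTRACT, PROPOSED)` and fail the build when the returned list is non-empty, which is one way to surface the producer/consumer mismatch before the change lands.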

Sheikh Shuvo: Now, just with the data contracts and managing that entire lifecycle, does it touch on data licensing as well and interactions with outside vendors that you're consuming the data from?

Mark Freeman: That's a really good question. So I think the main thing when it comes to data contracts is you have to have leverage between both parties. So working with external data, you don't have leverage over the people you're consuming data from, right? You may put a contract in place to prevent issues once it's already landed. But it's more important that both parties are present and can have a conversation with each other. Regarding external data, if you're sending your data out to other people, say, for instance, you have a data product, data contracts are extremely powerful there in that you can provide trust in your data. So we say this data product or this data asset is under contract, and by contract, we don't mean in a legal sense. We mean these are expectations that we're going to adhere to, and data contracts programmatically enforce those. So it's like, "Hey, this column needs to be numeric. Here are the semantics for this column and we don't expect deviations for this column." That's extremely powerful, especially with ML, where those small changes can wreak havoc or break systems, or even for dashboards. One of the key things is that both parties need to be there to enforce the contract on both sides. But if one party can't be there, you can still have those checks on ingestion. And then for serving it, it's like a badge saying, "Hey, this is under contract. We believe this is high-value data and you can trust this compared to other sources that aren't under contract."
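The "this column needs to be numeric" expectation can be sketched as a small ingestion check. This is an assumption-laden illustration in Python, not how any particular data contract tool works. The `check_rows` function, its parameters, and the sample rows are hypothetical.

```python
from numbers import Real


def check_rows(rows, column, minimum=None, maximum=None):
    """Check that `column` is numeric (and optionally within bounds) in every row.

    `rows` is an iterable of dicts. Returns a list of (row_index, message)
    tuples, one per violation; an empty list means the expectation held.
    """
    violations = []
    for index, row in enumerate(rows):
        value = row.get(column)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            violations.append((index, f"{column} is not numeric: {value!r}"))
            continue
        if minimum is not None and value < minimum:
            violations.append((index, f"{column} is below minimum {minimum}: {value!r}"))
        if maximum is not None and value > maximum:
            violations.append((index, f"{column} is above maximum {maximum}: {value!r}"))
    return violations
```

For example, `check_rows([{"price": 9.99}, {"price": "free"}], "price", minimum=0)` flags the second row. A consumer who cannot negotiate with an external producer could still run a check like this at ingestion, which matches the one-sided case Mark mentions.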

Sheikh Shuvo: With the popularity and recent talks of regulation on large language models in particular, do you feel there's a use case there for data contracts and helping inform the user experience on the trustworthiness?

Mark Freeman: Absolutely. So like I said, on the go-to-market side, I've been doing a lot of sales calls, wearing many hats in a startup. And the calls I have the most are with European companies, because they understand data governance and they care about data security and GDPR. And so, taking a step back from AI in general to data security and data governance, we're essentially building a data-governance-as-code solution. Many times when people talk about data governance, they have these amazing slides and frameworks and an understanding of the business. Right. But then the challenge is, well, we have all these rules. We have all these understandings. How do you enforce this at scale? And so data contracts enable these data governance people to actually enforce their amazing ideas at scale. And then regarding AI, again, I'm more so brainstorming now because we're going into this new uncharted territory. People are still figuring it out, right? But conceptually, the big reason I pushed into data engineering was Andrew Ng talking about data-centric AI, as opposed to model-centric AI, which people focused on in the 2010s, right? With model-centric AI, you put in a stronger model, and then you get better results. Well, now, you give better data with maybe weaker models, and you have way better results. And so, thinking about the accuracy of your AI or the reliability of your AI, data quality is paramount for that. And you can put controls in place before the data even reaches the AI model. I think that's extremely powerful for regulation and compliance for AI, especially being able to say this AI model uses data that is under contract with these constraints. These are the expectations, and the data will not deviate. And if it does, we'll know ahead of time. I think it's extremely powerful for AI.

Sheikh Shuvo: Along those lines, just looking at all the broad developments in the AI space, are there any particular areas of it on the research and development side that you follow closely?

Mark Freeman: I think another person I love following is Harpreet Sahota. He has actually dived into the LLM and the MLOps community. They focus on LLMOps. So that's how I learned about the RAG architecture, for example, for LLMs. I highly recommend checking out those folks. Also, madewithml.com. That's an amazing resource to learn how to build these ML tools. And the OpenAI folks, they've been exceptional. I think it's part of their strategy to get as many developers as possible involved. But regarding research, you know, I'm not reading the research papers because again, I'm in the data engineering space rather than the AI space. But what I do focus on is what people are building, what are some really cool GitHub repos, how are they doing it? Can I build it myself over a weekend? I think that's where a lot of the learning comes from. If you're a practitioner, find really cool projects and build something. That's going to be the best way you learn. And then if you're a leader, find people who are trusted voices who are synthesizing this information. One of my favorite people to listen to is Vin Vashishta. He wrote the book "From Data to Profit." Find those people who are really invested and credible. There's a difference between popular and credible.

Sheikh Shuvo: And the very last question I have for you, Mark, is if you could travel back in time and go back to college right now, what would you change about the types of things you studied, if any?

Mark Freeman: I'm actually really happy with what I studied. I'm happy I don't come from a traditional STEM background, coming from sociology, because that really gave me the writing skills I use now. And then my master's in community health and prevention research gave me the statistics skills but also a really unique perspective about applying these research skills. I would actually double down on what I did. I think going into a domain and learning the data skills for that specific domain is way more powerful. And I would advise any college student who's stressing over finals to chill out. I straight-up failed classes and got C's, and somehow made it to great schools and great jobs. It's not the end of the world. Enjoy yourself.

Sheikh Shuvo: Just chill should be the official motto on your website then.

Mark Freeman: Yeah, I think that should be. I'm going to put it on my LinkedIn like chill.

Sheikh Shuvo: Awesome, Mark. Well, thank you so much for sharing yourself and your experiences. I'm really excited to see what you're cooking up next.

Mark Freeman: Thank you so much for having me. It was a great time.



Subscribe

Listen to Humans of AI using one of many popular podcasting apps or directories.
