Other

PTC Windchill Success Story: The Benefits of Moving from PDM to PLM

A prominent furniture manufacturer deploys Fishbowl’s System Generated Drawing Automation to increase efficiencies with its enterprise part deployment within PTC Windchill

Our client has numerous global manufacturing facilities and uses PTC Windchill to streamline eBOM and mBOM processes. However, not all modifications to parts information propagate automatically or accurately at the drawing level. Updating plant-specific drawings with enterprise part information was a manual, error-prone, delay-filled process that diverted valuable engineering resources away from their value-added work.

The client desired a go-forward approach for their Windchill PLM implementation that would automatically update this critical enterprise part information. They became aware of our System Generated Drawing solution from a presentation at PTC LiveWorx. From the time of first contact, the Fishbowl Solutions team worked to deliver a solution that helped them realize their vision.

BUSINESS PROBLEMS
  • Manufacturing waste due to ordering obsolete or incorrect parts
  • Manufacturing delays due to drawing updates needed for non-geometric changes – title block, lifecycle, BOM, as well as environmental/regulatory compliance markings, variant designs, etc.
  • Manually updating product drawings with plant specific parts information took away valuable engineering time
SOLUTION HIGHLIGHTS
  • Fishbowl’s System Generated Drawing Automation systematically combines data from BOM, CAD, drawing/model, part attributes, and enterprise resource planning (ERP) systems
  • Creates complete, static views of drawings based on multiple event triggers
  • Creates a template-based PDF that is overlaid along with the CAD geometry to produce a final document that can be dynamically stamped along with applicable lifecycle and approval information
  • Real-time watermarking on published PDFs
RESULTS

  • Increased accuracy of enterprise parts information on drawings reduced product manufacturing waste
  • Design changes now move downstream quickly, enabling an increase in design-to-manufacturing operational efficiency

 

“Fishbowl’s System Generated Drawing Automation solution is the linchpin to our enterprise processes. It provides us with an automated method to include, update and proliferate accurate parts information throughout the business. This automation has in turn led to better data integrity, less waste, and more process efficiencies.” -PTC Windchill Admin/Developer

 

For more information about Fishbowl’s solution for System Generated Drawing Automation, click here.

The post PTC Windchill Success Story: The Benefits of Moving from PDM to PLM appeared first on Fishbowl Solutions.

Categories: Fusion Middleware, Other

Webinar: Improve WebCenter Portal Performance by 30% and get out of Oracle ADF Development Hell

DATE: Thursday, March 30th
TIME: 12:00 PM CST, 1:00 PM EST

Join Fishbowl’s Enterprise Architect, Jerry Aber, as he shares recommendations on performance improvements for WebCenter-based portals. Jerry has been delivering portal projects for over 15 years, and has been instrumental in developing a technology framework and methodology that provides repeatable and reusable development patterns for portal deployments and their ongoing administration and management. In this webinar, Jerry will share how leveraging modern web development technologies like Oracle JET, instead of ADF taskflows, can dramatically improve the performance of a portal – including the overall time to load the home page, as well as the time to make content or stylistic changes.

Jerry will also share how to architect a portal implementation to include a caching layer that further enhances performance. These topics will all be backed by real-world customer metrics that Jerry and the Fishbowl team have seen through numerous successful customer deployments.

If you are a WebCenter Portal administrator and are frustrated with the challenges of improving your ADF-centric portal, this webinar is for you. Come learn how to overhaul the ADF UI, which will reduce development complexity and lead to happier users.

Register today. 

New to Zoom? Go to zoom.us/test to ensure you can access the webinar.

The post Webinar: Improve WebCenter Portal Performance by 30% and get out of Oracle ADF Development Hell appeared first on Fishbowl Solutions.

Categories: Fusion Middleware, Other

Cloudera’s Data Science Workbench

DBMS2 - Sun, 2017-03-19 19:41

0. Matt Brandwein of Cloudera briefed me on the new Cloudera Data Science Workbench. The problem it purports to solve is:

  • One way to do data science is to repeatedly jump through the hoops of working with a properly-secured Hadoop cluster. This is difficult.
  • Another way is to extract data from a Hadoop cluster onto your personal machine. This is insecure (once the data arrives) and not very parallelized.
  • A third way is needed.

Cloudera’s idea for a third way is:

  • You don’t run anything on your desktop/laptop machine except a browser.
  • The browser connects you to a Docker container that holds (and isolates) a kind of virtual desktop for you.
  • The Docker container runs on your Cloudera cluster, so connectivity-to-Hadoop and security are handled rather automagically.

In theory, that’s pure goodness … assuming that the automagic works sufficiently well. I gather that Cloudera Data Science Workbench has been beta tested by 5 large organizations and many 10s of users. We’ll see what is or isn’t missing as more customers take it for a spin.

1. Recall that Cloudera installations have 4 kinds of nodes. 3 are obvious:

  • Hadoop worker nodes.
  • Hadoop master nodes.
  • Nodes that run Cloudera Manager.

The fourth kind is edge/gateway nodes. Those handle connections to the outside world, and can also run selected third-party software. They are also where Cloudera Data Science Workbench lives.

2. One point of this architecture is to let each data scientist run the languages and tools of her choice. Docker isolation is supposed to make that practical and safe.

And so we have a case of the workbench metaphor actually being accurate! While a “workbench” is commonly just an integrated set of tools, in this case it’s also a place for you to use other tools you personally like and bring in.

Surely there are some restrictions as to which tools you can use, but I didn’t ask for those to be spelled out.

3. Matt kept talking about security, to an extent I recall in almost no other analytics-oriented briefing. This had several aspects.

  • As noted above, a lot of the hassle of Hadoop-based data science relates to security.
  • As also noted above, evading the hassle by extracting data is a huge security risk. (If you lose customer data, you’re going to have a very, very bad day.)
  • According to Matt, standard uses of notebook tools such as Jupyter or Zeppelin wind up having data stored wherever code is. Cloudera’s otherwise similar notebook-style interface evidently avoids that flaw. (Presumably, if you want to see the output, you rerun the script against the data store yourself.)

4. To a first approximation, the target users of Cloudera Data Science Workbench can be characterized the same way BI-oriented business analysts are. They’re people with:

  • Sufficiently good quantitative skills to do the analysis.
  • Sufficiently good computer skills to do SQL queries and so on, but not a lot more than that.

Of course, “sufficiently good quantitative skills” can mean something quite different in data science than it does for the glorified arithmetic of ordinary business intelligence.

5. Cloudera Data Science Workbench doesn’t have any special magic in parallelization. It just helps you access the parallelism that’s already out there. Some algorithms are easy to parallelize. Some libraries have parallelized a few algorithms beyond that. Otherwise, you’re on your own.

6. When I asked whether Cloudera Data Science Workbench was open source (like most of what Cloudera provides) or closed source (like Cloudera Manager), I didn’t get the clearest of answers. On the one hand, it’s a Cloudera-specific product, as the name suggests; on the other, it’s positioned as having been stitched together almost entirely from a collection of open source projects.

Categories: Other

Welcome to the new Fishbowl Solutions Blog

Out with the old and in with the new.  Welcome to the new home of the Fishbowl Solutions blog! Please enjoy upgraded functionality and integration with our website.  Check back often for new and exciting posts from our talented staff.  If you want automatic updates, click the subscribe link to the right and be notified whenever a new post appears.


The post Welcome to the new Fishbowl Solutions Blog appeared first on Fishbowl Solutions.

Categories: Fusion Middleware, Other

Introduction to SequoiaDB and SequoiaCM

DBMS2 - Sun, 2017-03-12 13:19

For starters, let me say:

  • SequoiaDB, the company, is my client.
  • SequoiaDB, the product, is the main product of SequoiaDB, the company.
  • SequoiaDB, the company, has another product line, SequoiaCM, which subsumes SequoiaDB in content management use cases.
  • SequoiaDB, the product, is fundamentally a JSON data store. But it has a relational front end …
  • … and is usually sold for RDBMS-like use cases …
  • … except when it is sold as part of SequoiaCM, which adds in a large object/block store and a content-management-oriented library.
  • SequoiaDB’s products are open source.
  • SequoiaDB’s largest installation seems to be 2 PB across 100 nodes; that includes block storage.
  • Figures for DBMS-only database sizes aren’t as clear, but the sweet spot of the cluster-size range for such use cases seems to be 6-30 nodes.

Also:

  • SequoiaDB, the company, was founded in Toronto, by former IBM DB2 folks.
  • Even so, it’s fairly accurate to view SequoiaDB as a Chinese company. Specifically:
    • SequoiaDB’s founders were Chinese nationals.
    • Most of them went back to China.
    • Other employees to date have been entirely Chinese.
    • Sales to date have been entirely in China, but SequoiaDB has international aspirations.
  • SequoiaDB has >100 employees, a large majority of whom are split fairly evenly between “engineering” and “implementation and technical support”.
  • SequoiaDB’s marketing (as opposed to sales) department is astonishingly tiny.
  • SequoiaDB cites >100 subscription customers, including 10 in the global Fortune 500, a large fraction of which are in the banking sector. (Other sectors mentioned repeatedly are government and telecom.)

Unfortunately, SequoiaDB has not captured a lot of detailed information about unpaid open source production usage.

While I usually think that the advantages of open source are overstated, in SequoiaDB’s case open source will have an additional benefit when SequoiaDB does go international — it addresses any concerns somebody might have about using Chinese technology.

SequoiaDB’s technology story starts:

  • SequoiaDB is a layered DBMS.
  • It manages JSON via update-in-place. MVCC (Multi-Version Concurrency Control) is on the roadmap.
  • Indexes are B-tree.
  • Transparent sharding and elasticity happen in what by now is the industry-standard/best-practices way:
    • There are many (typically 4096) logical partitions, many of which are assigned to each physical partition.
    • If the number of physical partitions changes, logical partitions are reassigned accordingly. (A short sketch of this pattern follows this list.)
  • Relational OLTP (OnLine Transaction Processing) functionality is achieved by using a kind of PostgreSQL front end.
  • Relational batch processing is done via SparkSQL.
  • There also is a block/LOB (Large OBject) storage engine meant for content management applications.
  • SequoiaCM boils down technically to:
    • SequoiaDB, which is used to store JSON metadata about the LOBs …
    • … and whose generic-DBMS coordination capabilities are also used over the block/LOB engine.
    • A Java library focused on content management.
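
The logical-versus-physical indirection described above is easy to sketch. What follows is a minimal, generic illustration in Python, not SequoiaDB code; the hash function and the round-robin assignment of 4096 logical partitions to physical partitions are assumptions used only to show the pattern:

# A minimal sketch (not SequoiaDB code) of the scheme described above:
# records hash to one of 4096 logical partitions, and the logical
# partitions are dealt out across however many physical partitions exist.
import hashlib

NUM_LOGICAL = 4096  # the typical figure cited above

def logical_partition(shard_key):
    """Hash a shard key to a stable logical partition id."""
    digest = hashlib.md5(shard_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_LOGICAL

def assignment(num_physical):
    """Map every logical partition to a physical partition (round-robin)."""
    return {lp: lp % num_physical for lp in range(NUM_LOGICAL)}

# Elasticity: growing from 6 to 8 physical partitions only moves the
# logical partitions whose assignment changed, not every stored record.
before, after = assignment(6), assignment(8)
moved = sum(1 for lp in range(NUM_LOGICAL) if before[lp] != after[lp])
print(f"{moved} of {NUM_LOGICAL} logical partitions would move")
print("key 'device-42' lives in logical partition", logical_partition("device-42"))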

SequoiaDB’s relationship with PostgreSQL is complicated, but as best I understand SequoiaDB’s relational operations:

  • SQL parsing, optimization, and so on rely mainly on PostgreSQL code. (Of course, there are some hacks, such as to the optimizer’s cost functions.)
  • Actual data storage is done via SequoiaDB’s JSON store, using PostgreSQL Foreign Data Wrappers. Each record goes in a separate JSON document. Locks, commits and so on — i.e. “write prevention” :) — are handled by the JSON store.
  • PostgreSQL’s own storage engine is actually part of the stack, but only to manage temp space and the like.

PostgreSQL stored procedures are already in the SequoiaDB product. Triggers and referential integrity are not. Neither, so far as I can tell, are PostgreSQL’s datatype extensibility capabilities.

I neglected to ask how much of that remains true when SparkSQL is invoked.

SequoiaDB’s use cases to date seem to fall mainly into three groups:

  • Content management via SequoiaCM.
  • “Operational data lakes”.
  • Pretty generic replacement of legacy RDBMS.

Internet back-ends, however — and this is somewhat counter-intuitive for an open-source JSON store — are rare, at least among paying subscription customers. But SequoiaDB did tell me of one classic IoT (Internet of Things) application, with lots of devices “phoning home” and the results immediately feeding a JSON-based dashboard.

To understand SequoiaDB’s “operational data lake” story, it helps to understand the typical state of data warehousing at SequoiaDB’s customers and prospects, which isn’t great:

  • 2-3 years of data, and not all the data even from that time period.
  • Only enough processing power to support structured business intelligence …
  • … and hence little opportunity for ad-hoc query.

SequoiaDB operational data lakes offer multiple improvements over that scenario:

  • They hold as much relational data as customers choose to dump there.
  • That data can be simply copied from operational stores, with no transformation.
  • Or if data arrives via JSON — from external organizations or micro-services as the case may be — the JSON can be stored unmodified as well.
  • Queries can be run straight against this data soup.
  • Of course, views can also be set up in advance to help with querying.

Views are particularly useful with what might be called slowly changing schemas. (I didn’t check whether what SequoiaDB is talking about matches precisely with the more common term “slowly changing dimensions”.) Each time the schema changes, a new table is created in SequoiaDB to receive copies of the data. If one wants to query against the parts of the database structure that didn’t change — well, a view can be established to allow for that.
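
That pattern is not SequoiaDB-specific and is easy to illustrate generically. The sketch below uses Python's built-in sqlite3 module purely for illustration; the table and column names are made up, and the syntax is generic SQL rather than anything SequoiaDB-particular:

# A generic, runnable illustration of the pattern above, using Python's
# built-in sqlite3 module (not SequoiaDB syntax). Table and column names
# are made up: each schema version gets its own table, and a view hides
# the change from queries that only touch the unchanged columns.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- original schema
    CREATE TABLE orders_v1 (order_id INTEGER, customer_id INTEGER, amount REAL);
    -- a later schema version adds a column; new copies of the data land here
    CREATE TABLE orders_v2 (order_id INTEGER, customer_id INTEGER, amount REAL, channel TEXT);

    INSERT INTO orders_v1 VALUES (1, 100, 25.0);
    INSERT INTO orders_v2 VALUES (2, 101, 40.0, 'web');

    -- the view exposes only the columns that did not change
    CREATE VIEW orders_all AS
        SELECT order_id, customer_id, amount FROM orders_v1
        UNION ALL
        SELECT order_id, customer_id, amount FROM orders_v2;
""")

for row in conn.execute("SELECT * FROM orders_all ORDER BY order_id"):
    print(row)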

Finally, it seems that SequoiaCM uses are concentrated in what might be called “security and checking-up” areas, such as:

  • Photographs as part of an authentication process.
  • Video of in-person banking transactions, both for fraud prevention and for general service quality assurance.
  • Storage of security videos (for example from automated teller machines).

SequoiaCM deals seem to be bigger than other SequoiaDB ones, surely in part because the amounts of data managed are larger.

Categories: Other

One bit of news in Trump’s speech

DBMS2 - Tue, 2017-02-28 23:26

Donald Trump addressed Congress tonight. As may be seen by the transcript, his speech — while uncharacteristically sober — was largely vacuous.

That said, while Steve Bannon is firmly established as Trump’s puppet master, they don’t agree on quite everything, and one of the documented disagreements had been in their view of skilled, entrepreneurial founder-type immigrants: Bannon opposes them, but Trump has disagreed with his view. And as per the speech, Trump seems to be maintaining his disagreement.

At least, that seems implied by his call for “a merit-based immigration system.”

And by the way — Trump managed to give a whole speech without saying anything overtly racist. Indeed, he specifically decried the murder of an Indian-immigrant engineer. By Trump standards, that counts as a kind of progress.

Categories: Other

Coordination, the underused “C” word

DBMS2 - Tue, 2017-02-28 22:34

I’d like to argue that a single frame can be used to view a lot of the issues that we think about. Specifically, I’m referring to coordination, which I think is a clearer way of characterizing much of what we commonly call communication or collaboration.

It’s easy to argue that computing, to an overwhelming extent, is really about communication. Most obviously:

  • Data is constantly moving around — across wide area networks, across local networks, within individual boxes, or even within particular chips.
  • Many major developments are almost purely about communication. The most important computing device today may be a telephone. The World Wide Web is essentially a publishing platform. Social media are huge. Etc.

Indeed, it’s reasonable to claim:

  • When technology creates new information, it’s either analytics or just raw measurement.
  • Everything else is just moving information around, and that’s communication.

A little less obvious is that much of this communication could alternatively be described as coordination. Some communication has pure consumer value, such as when we talk/email/Facebook/Snapchat/FaceTime with loved ones. But much of the rest is for the purpose of coordinating business or technical processes.

Among the technical categories that boil down to coordination are:

  • Operating systems.
  • Anything to do with distributed computing.
  • Anything to do with system or cluster management.
  • Anything that’s called “collaboration”.

That’s a lot of the value in “platform” IT right there. 

Meanwhile, in pre-internet apps:

  • Some of the early IT wins were in pure accounting and information management. But a lot of the rest were in various forms of coordination, such as logistics and inventory management.
  • The glory days of enterprise apps really started with SAP’s emphasis on “business process”. (“Business process reengineering” was also a major buzzword back in the day.)

This also all fits with the “route” part of my claim that “historically, application software has existed mainly to record and route information.”

And in the internet era:

  • “Sharing economy” companies, led by Uber and Airbnb, have created a lot more shareholder value than the most successful pure IT startups of the era.
  • Amazon, in e-commerce and cloud computing alike, has run some of the biggest coordination projects of all.

This all ties into one of the key underlying subjects to modern politics and economics, namely the future of work.

  • Globalization is enabled by IT’s ability to coordinate far-flung enterprises.
  • Large enterprises need fewer full-time employees when individual or smaller-enterprise contractors are easier to coordinate. (It’s been 30 years since I drew a paycheck from a company I didn’t own.)
  • And of course, many white collar jobs are being entirely automated away, especially those that can be stereotyped as “paper shuffling”.

By now, I hope it’s clear that “coordination” covers a whole lot of IT. So why do I think using a term with such broad application adds any clarity? I’ve already given some examples above, in that:

  • “Coordination” seems clearer than “communication” when characterizing the essence of distributed computing.
  • “Coordination” seems clearer than “communication” if we’re discussing the functioning of large enterprises or of large-enterprise-substitutes.

Further — even when we focus on the analytic realm, the emphasis on “coordination” has value. A big part of analytic value comes in determining when to do something. Specifically that arises when:

  • Analytics identifies a problem that just occurred, or is about to happen, allowing a timely fix.
  • Business intelligence is used for monitoring, of impending problems or otherwise, as a guide to when action is needed.
  • Logistics of any kind get optimized.

I’d also say that most recommendation/personalization fits into the “coordination” area, but that’s a bit more of a stretch; you’re welcome to disagree.

I do not claim that analytics’ value can be wholly captured by the “coordination” theme. Decisions about whether to do something major — or about what to do — are typically made by small numbers of people; they turn into major coordination exercises only after a project gets its green light. But such cases, while important, are pretty rare. For the most part, analytic results serve as inputs to business processes. And business processes, on the whole, typically have a lot to do with coordination.

Bottom line: Most of what’s valuable in IT relates to communication or coordination. Apparent counterexamples should be viewed with caution.

Related links

Categories: Other

There’s no escape from politics now

DBMS2 - Wed, 2017-02-01 23:31

The United States and consequently much of the world are in political uproar. Much of that is about very general and vital issues such as war, peace or the treatment of women. But quite a lot of it is to some extent tech-industry-specific. The purpose of this post is to outline how and why that is.

For example:

  • There’s a worldwide backlash against “elites” — and tech industry folks are perceived as members of those elites.
  • That perception contains a lot of truth, and not just in terms of culture/education/geography. Indeed, it may even be a bit understated, because trends commonly blamed on “trade” or “globalization” often have their roots in technological advances.
  • There’s a worldwide trend towards authoritarianism. Surveillance/privacy and censorship issues are strongly relevant to that trend.
  • Social media companies are up to their neck in political considerations.

Because they involve grave threats to liberty, I see surveillance/privacy as the biggest technology-specific policy issues in the United States. (In other countries, technology-driven censorship might loom larger yet.) My views on privacy and surveillance have long been:

  • Fixing the legal frameworks around information use is a difficult and necessary job. The tech community should be helping more than it is.
  • Until those legal frameworks are indeed cleaned up, the only responsible alternative is to foot-drag on data collection, on data retention, and on the provision of data to governmental agencies.

Given the recent election of a US president with strong authoritarian tendencies, that foot-dragging is much more important than it was before.

Other important areas of technology/policy overlap include:

  • The new head of the Federal Communications Commission is hostile to network neutrality. (Perhaps my compromise proposal for partial, market-based network neutrality should get another look some day.)
  • There’s a small silver lining in Trump’s attacks on free trade; the now-abandoned (at least by the US) Trans-Pacific Partnership had gone too far on “intellectual property” rights.
  • I’m a skeptic about software patents.
  • Government technology procurement processes have long been broken.
  • “Sharing economy” companies such as Uber and Airbnb face a ton of challenges in politics and regulation, often on a very local basis.

And just over the past few days, the technology industry has united in opposing the Trump/Bannon restrictions on valuable foreign visitors.

Tech in the wider world

Technology generally has a huge impact on the world. One political/economic way of viewing that is:

  • For a couple of centuries, technological advancement has:
    • Destroyed certain jobs.
    • Replaced them directly with a smaller number of better jobs.
    • Increased overall wealth, which hopefully leads to more, better jobs in total.
  • Over a similar period, improvements in transportation technology have moved work opportunities from richer countries to poorer areas (countries or colonies as the case may be). This started in farming and extraction, later expanded to manufacturing, and now includes “knowledge workers” as well.
  • Both of these trends are very strong in the current computer/internet era.
  • Many working- and middle-class people in richer countries now feel that these trends are leaving them worse off.
    • To some extent, they’re confusing correlation and causality. (The post-WW2 economic boom would have slowed no matter what.)
    • To some extent, they’re ignoring the benefits of technology in their day to day lives. (I groan when people get on the internet to proclaim that technology is something bad.)
    • To some extent, however, they are correct.

Further, technology is affecting how people relate to each other, in multiple ways.

  • This is obviously the case with respect to cell phones and social media.
  • Also, changes to the nature of work naturally lead to changes in the communities where the workers live.

For those of us with hermit-like tendencies or niche interests, that may all be a net positive. But others view these changes less favorably.

Summing up: Technology induces societal changes of such magnitudes as to naturally cause (negative) political reactions.

And in case you thought I was exaggerating the political threat to the tech industry …

… please consider the following quotes from Trump’s most powerful advisor, Steve Bannon:

The “progressive plutocrats in Silicon Valley,” Bannon said, want unlimited ability to go around the world and bring people back to the United States. “Engineering schools,” Bannon said, “are all full of people from South Asia, and East Asia. . . . They’ve come in here to take these jobs.” …

“Don’t we have a problem with legal immigration?” asked Bannon repeatedly.

“Twenty percent of this country is immigrants. Is that not the beating heart of this problem?”

Related links

I plan to keep updating the list of links at the bottom of my post Politics and policy in the age of Trump.

Categories: Other

Politics and policy in the age of Trump

DBMS2 - Wed, 2017-02-01 23:28

The United States presidency was recently assumed by an Orwellian lunatic.* Sadly, this is not an exaggeration. The dangers — both of authoritarianism and of general mis-governance — are massive. Everybody needs in some way to respond.

*”Orwellian lunatic” is by no means an oxymoron. Indeed, many of the most successful tyrants in modern history have been delusional; notable examples include Hitler, Stalin, Mao and, more recently, Erdogan. (By way of contrast, I view most other Soviet/Russian leaders and most jumped-up-colonel coup leaders as having been basically sane.)

There are many candidates for what to focus on, including:

  • Technology-specific issues — e.g. privacy/surveillance, network neutrality, etc.
  • Issues in which technology plays a large role — e.g. economic changes that affect many people’s employment possibilities.
  • Subjects that may not be tech-specific, but are certainly of great importance. The list of candidates here is almost endless, such as health care, denigration of women, maltreatment of immigrants, or the possible breakdown of the whole international order.

But please don’t just go on with your life and leave the politics to others. Those “others” you’d like to rely on haven’t been doing a very good job.

What I’ve chosen to do personally includes:

  • Get and stay current in my own knowledge. That’s of course a prerequisite for everything else.
  • Raise consciousness among my traditional audience. This post is an example. :)
  • Educate my traditional audience. Some of you are American, well-versed in history and traditional civics. Some of you are American, but not so well-versed. Some of you are from a broad variety of other countries. The sweet spot of my target is the smart, rational, not-so-well-versed Americans. But I hope others are interested as well.
  • Prepare for such time as nuanced policy analysis is again appropriate. In the past, I’ve tried to make thoughtful, balanced, compromise suggestions for handling thorny issues such as privacy/surveillance or network neutrality. In this time of crisis, people don’t care, and I don’t blame them at all. But hopefully this ill wind will pass, and serious policy-making will restart. When it does, we should be ready for it.
  • Support my family in whatever they choose to do. It’s a small family, but it includes some stars, more articulate and/or politically experienced than I am.

Your choices will surely differ (and later on I will offer suggestions as to what those choices might be). But if you take only one thing from this post and its hopefully many sequels, please take this: Ignoring politics is no longer a rational choice.

Related links

This is my first politics/policy-related post since the start of the Trump (or Trump/Bannon) Administration. I’ll keep a running guide to others here, and in the comments below.

  • The technology industry in particular is now up to its neck in politics. I gave quite a few examples to show why for tech folks there’s no escaping politics now.
  • Some former congressional staffers put out a great guide to influencing your legislators. It’s focused on social justice and anti-discrimination kinds of issues, but can probably be applied more broadly, e.g. to Senator Feinstein’s (D-Cal) involvement in overseeing the intelligence community.
Categories: Other

Introduction to Crate.io and CrateDB

DBMS2 - Sat, 2016-12-17 23:27

Crate.io and CrateDB basics include:

  • Crate.io makes CrateDB.
  • CrateDB is a quasi-RDBMS designed to receive sensor data and similar IoT (Internet of Things) inputs.
  • CrateDB’s creators were perhaps a little slow to realize that the “R” part was needed, but are playing catch-up in that regard.
  • Crate.io is an outfit founded by Austrian guys, headquartered in Berlin, that is turning into a San Francisco company.
  • Crate.io says it has 22 employees and 5 paying customers.
  • Crate.io cites bigger numbers than that for confirmed production users, clearly active clusters, and overall product downloads.

In essence, CrateDB is an open source and less mature alternative to MemSQL. The opportunity for MemSQL and CrateDB alike exists in part because analytic RDBMS vendors didn’t close it off.

CrateDB’s not-just-relational story starts:

  • A column can contain ordinary values (of usual-suspect datatypes) or “objects”, …
  • … where “objects” presumably are the kind of nested/hierarchical structures that are common in the NoSQL/internet-backend world, …
  • … except when they’re just BLOBs (Binary Large OBjects).
  • There’s a way to manually define “strict schemas” on the structured objects, and a syntax for navigating their structure in WHERE clauses.
  • There’s also a way to automagically infer “dynamic schemas”, but it’s simplistic enough to be more suitable for development/prototyping than for serious production.

Crate gave an example of data from >800 kinds of sensors being stored together in a single table. This leads to significant complexity in the FROM clauses. But querying the same data in a relational schema would be at least as complicated, and probably worse.
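
To make that concrete, here is a hedged sketch of what such a query might look like. It assumes CrateDB's bracket-style syntax for navigating object columns and the DB-API style crate Python client; the table and column names are hypothetical:

# A hedged sketch of querying such a mixed sensor table. It assumes
# CrateDB's bracket-style syntax for object columns and the DB-API style
# "crate" Python client; the table and column names are hypothetical.
from crate import client

conn = client.connect("http://localhost:4200")
cursor = conn.cursor()

# One wide table holds readings from many sensor types; the object column
# "payload" carries whatever fields a given sensor type reports.
cursor.execute(
    """
    SELECT device_id, ts, payload['temperature'] AS temp_c
    FROM sensor_readings
    WHERE sensor_type = 'thermostat'
      AND payload['temperature'] > 30
    ORDER BY ts DESC
    LIMIT 100
    """
)
for device_id, ts, temp_c in cursor.fetchall():
    print(device_id, ts, temp_c)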

One key to understanding Crate’s architectural choices is to note that they’re willing to have different latency/consistency standards for:

  • Writes and single-row look-ups.
  • Aggregates and joins.

And so it makes sense that:

  • Data is banged into CrateDB in a NoSQL-ish kind of way as it arrives, with RYW consistency.
  • The indexes needed for SQL functionality are updated in microbatches as soon as possible thereafter. (Think 100 milliseconds as a base case.) Crate.io characterizes the consistency for this part as “eventual”.

CrateDB will never have real multi-statement transactions, but it has simpler levels of isolation that may be called “transactions” in some marketing contexts.

CrateDB technical highlights include:

  • CrateDB records are stored as JSON documents. (Actually, I didn’t ask whether this was true JSON or rather something “JSON-like”.)
    • In the purely relational case, the documents may be regarded as glorified text strings.
    • I got the impression that BLOB storage was somewhat separate from the rest.
  • CrateDB’s sharding story starts with consistent hashing.
    • Shards are physical-only. CrateDB lacks the elasticity-friendly feature of there being many logical shards for each physical shard.
    • However, you can change your shard count, and any future inserts will go into the new set of shards.
  • In line with its two consistency models, CrateDB also has two indexing strategies.
    • Single-row/primary-key lookups have a “forward lookup” index, whatever that is.
    • Tables also have a columnar index.
      • More complex queries and aggregations are commonly done straight against the columnar index, rather than the underlying data.
      • CrateDB’s principal columnar indexing strategy sounds a lot like inverted-list, which in turn is a lot like standard text indexing.
      • Specific datatypes — e.g. geospatial — can be indexed in different ways.
    • The columnar index is shard-specific, and located at the same node as the shard.
    • At least the hotter parts of the columnar index will commonly reside in memory. (I didn’t ask whether this was via straightforward caching or some more careful strategy.)
  • While I didn’t ask about CrateDB’s replication model in detail, I gathered that:
    • Data is written synchronously to all nodes. (That’s sort of implicit in RYW consistency anyway.)
    • Common replication factors are either 1 or 3, depending on considerations such as the value of the data. But as is usual, some tables can be replicated across all nodes.
    • Data can be read from all replicas, for obvious reasons of performance.
  • Where relevant — e.g. the wire protocol or various SQL syntax specifics — CrateDB tends to emulate Postgres.
  • The CrateDB stack includes Elasticsearch and Lucene, both of which make sense in connection with Crate’s text/document orientation.

Crate.io is proud of its distributed/parallel story.

  • Any CrateDB node can plan a query. Necessary metadata for that is replicated across the cluster.
  • Execution starts on a shard-by-shard basis. Data is sorted at each shard before being sent onward.
  • Crate.io encourages you to run Spark and CrateDB on the same nodes.
    • This is supported by parallel Spark-CrateDB integration of the obvious kind.
    • Crate.io notes a happy synergy to this plan, in that Spark stresses CPU while CrateDB is commonly I/O-bound.

The CrateDB-Spark integration was the only support I could find for various marketing claims about combining analytics with data management.

Given how small and young Crate.io is, there are of course many missing features in CrateDB. In particular:

  • A query can only reshuffle data once. Hence, CrateDB isn’t currently well-designed for queries that join more than 2 tables together.
  • The only join strategy currently implemented is nested loop. Others are in the future.
  • CrateDB has most of ANSI SQL 92, but little or nothing specific to SQL 99. In particular, SQL windowing is under development.
  • Geo-distribution is still under development (even though most CrateDB data isn’t actually about people).
  • I imagine CrateDB administrative tools are still rather primitive.

In any case, creating a robust DBMS is an expensive and time-consuming process. Crate has a long road ahead of it.

Categories: Other

Command Line and Vim Tips from a Java Programmer

I’m always interested in learning more about useful development tools. In college, most programmers get an intro to the Linux command line environment, but I wanted to share some commands I use daily that I’ve learned since graduation.

Being comfortable on the command line is a great skill to have when a customer is looking over your shoulder on a Webex. They could be watching a software demo or deployment to their environment. It can also be useful when learning a new code base or working with a product with a large, unfamiliar directory structure with lots of logs.

If you’re on Windows, you can use Cygwin to get a Unix-like CLI to make these commands available.

Useful Linux commands Find

The command find helps you find files by recursively searching subdirectories. Here are some examples:

find .
  Prints all files and directories under the current directory.

find . -name '*.log'
  Prints all files and directories that end in “.log”.

find /tmp -type f -name '*.log'
  Prints only files in the directory “/tmp” that end in “.log”.

find . -type d
  Prints only directories.

find . -maxdepth 2
  Prints all files and directories under the current directory, and subdirectories (but not sub-subdirectories).

find . -type f -exec ls -la {} \;
  The -exec flag runs a command against each file instead of printing the name. In this example, it will run ls -la filename on each file under the current directory. The curly braces take the place of the filename.

Grep

The command grep lets you search text for lines that match a specific string. It can be helpful to add your initials to debug statements in your code and then grep for them to find them in the logs.

grep foo filename
  Prints each line in the file “filename” that matches the string “foo”.

grep foo\\\|bar filename
  Grep supports regular expressions, so this prints each line in the file that matches “foo” or “bar”.

grep -i foo filename
  Add -i for case insensitive matching.

grep foo *
  Use the shell wildcard, an asterisk, to search all files in the current directory for the string “foo”.

grep -r foo *
  Recursively search all files and directories in the current directory for a string.

grep -rnH foo filename
  Add -n to print line numbers and -H to print the filename on each line.

find . -type f -name '*.log' -exec grep -nH foo {} \;
  Combining find and grep can let you easily search each file that matches a certain name for a string. This will print each line that matches “foo” along with the file name and line number in each file that ends in “.log” under the current directory.

ps -ef | grep processName
  The output of any command can be piped to grep, and the lines of STDOUT that match the expression will be printed. For example, you could use this to find the pid of a process with a known name.

cat file.txt | grep -v foo
  You can also use -v to print all lines that don’t match an expression.

Ln

The command ln lets you create links. I generally use this to create links in my home directory to quickly cd into long directory paths.

ln -s /some/really/long/path foo
  The -s is for symbolic, and the long path is the target. The output of ls -la in this case would be foo -> /some/really/long/path.

Bashrc

The Bashrc is a shell script that gets executed whenever Bash is started in an interactive terminal. It is located in your home directory, ~/.bashrc. It provides a place to edit your $PATH, $PS1, or add aliases and functions to simplify commonly used tasks.

Aliases are a way you can define your own command line commands. Here are a couple useful aliases I’ve added to my .bashrc that have saved a lot of keystrokes on a server where I’ve installed Oracle WebCenter:

WC_DOMAIN=/u01/oracle/fmw/user_projects/domains/wc_domain
alias assets="cd /var/www/html"
alias portalLogs="cd $WC_DOMAIN/servers/WC_Spaces/logs"
alias domain="cd $WC_DOMAIN"
alias components="cd $WC_DOMAIN/ucm/cs/custom"
alias rpl="portalLogs; vim -R WC_Spaces.out"

After making changes to your .bashrc, you can load them with source ~/.bashrc. Now I can type rpl, short for Read Portal Logs, from anywhere to quickly jump into the WebCenter portal log file.

alias grep="grep --color"

This grep alias adds the --color option to all of my grep commands.  All of the above grep commands still work, but now all of the matches will be highlighted.

Vim

Knowing Vim key bindings can be convenient and efficient if you’re already working on the command line. Vim has many built-in shortcuts to make editing files quick and easy.

Run vim filename.txt to open a file in Vim. Vim starts in Normal Mode, where most characters have a special meaning, and typing a colon, :, lets you run Vim commands. For example, typing Shift-G will jump to the end of the file, and typing :q while in normal mode will quit Vim. Here is a list of useful commands:

:q
  Quits Vim

:w
  Write the file (save)

:wq
  Write and quit

:q!
  Quit and ignore warnings that you didn’t write the file

:wq!
  Write and quit, ignoring permission warnings

i
  Enter Insert Mode where you can edit the file like a normal text editor

a
  Enter Insert Mode and place the cursor after the current character

o
  Insert a blank line after the current line and enter Insert Mode

[escape]
  The escape button exits insert mode

:150
  Jump to line 150

shift-G
  Jump to the last line

gg
  Jump to the first line

/foo
  Search for the next occurrence of “foo”. Regex patterns work in the search.

?foo
  Search for the previous occurrence of “foo”

n
  Go to the next match

N
  Go to the previous match

*
  Search for the next occurrence of the searched word under the cursor

#
  Search for the previous occurrence of the searched word under the cursor

w
  Jump to the next word

b
  Jump to the previous word

``
  Jump back to the position before your last jump

dw
  Delete the word starting at the cursor

cw
  Delete the word starting at the cursor and enter insert mode

c$
  Delete everything from the cursor to the end of the line and enter insert mode

dd
  Delete the current line

D
  Delete everything from the cursor to the end of the line

u
  Undo the last action

ctrl-r
  Redo the last action

d[up]
  Delete the current line and the line above it. “[up]” is for the up arrow.

d[down]
  Delete the current line and the line below it

d3[down]
  Delete the current line and the three lines below it

r[any character]
  Replace the character under the cursor with another character

~
  Toggle the case (upper or lower) of the character under the cursor

v
  Enter Visual Mode. Use the arrow keys to highlight text.

shift-V
  Enter Visual Mode and highlight whole lines at a time.

ctrl-v
  Enter Visual Mode but highlight blocks of characters.

=
  While in Visual Mode, = will auto format highlighted text.

c
  While in Visual Mode, c will cut the highlighted text.

y
  While in Visual Mode, y will yank (copy) the highlighted text.

p
  In Normal Mode, p will paste the text in the buffer (that’s been yanked or cut).

yw
  Yank the text from the cursor to the end of the current word.

:sort
  Highlight lines in Visual Mode, then use this command to sort them alphabetically.

:s/foo/bar/g
  Highlight lines in Visual Mode, then use search and replace to replace all instances of “foo” with “bar”.

:s/^/#/
  Highlight lines in Visual Mode, then add # at the start of each line. This is useful to comment out blocks of code.

:s/$/;/
  Highlight lines in Visual Mode, then add a semicolon at the end of each line.

:set paste
  This will turn off auto indenting. Use it before pasting into Vim from outside the terminal (you’ll want to be in insert mode before you paste).

:set nopaste
  Make auto indenting return to normal.

:set nu
  Turn on line numbers.

:set nonu
  Turn off line numbers.

:r!pwd
  Read the output of a command into Vim. In this example, we’ll read in the current directory.

:r!sed -n 5,10p /path/to/file
  Read lines 5 through 10 from another file in Vim. This can be a good way to copy and paste between files in the terminal.

:[up|down]
  Type a colon and then use the arrow keys to browse through your command history. If you type letters after the colon, it will only go through commands that match them (e.g., typing :se and then up would help find “:set paste” quickly).

Vimrc

The Vimrc is a configuration file that Vim loads whenever it starts up, similar to the Bashrc. It is in your home directory.

Here is a basic Vimrc I’d recommend for getting started if you don’t have one already. Run vim ~/.vimrc and paste in the following:

set backspace=2         " backspace in insert mode works like normal editor
syntax on               " syntax highlighting
filetype indent on      " activates indenting for files
set autoindent          " auto indenting
set number              " line numbers
colorscheme desert      " colorscheme desert
set listchars=tab:>-,trail:.,extends:>,precedes:<
set list                " Set up whitespace characters
set ic                  " Ignore case by default in searches
set statusline+=%F      " Show the full path to the file
set laststatus=2        " Make the status line always visible

 

Perl

Perl comes installed by default on Linux, so it is worth mentioning that it has some extensive command line capabilities. If you have ever tried to grep for a string that matches a line in a minified Javascript file, you can probably see the benefit of being able to filter out lines longer than 500 characters.

grep -r foo * | perl -nle'print if 500 > length'

Conclusion

I love learning the tools that are available in my development environment, and it is exciting to see how they can help customers as well.

Recently, I was working with a customer and we were running into SSL issues. Java processes can be run with the option -Djavax.net.ssl.trustStore=/path/to/trustStore.jks to specify which keystore to use for SSL certificates. It was really easy to run ps -ef | grep trustStore to quickly identify which keystore we needed to import certificates into.

I’ve also been able to use various find and grep commands to search through unfamiliar directories after exporting metadata from Oracle’s MDS Repository.

Even if you aren’t on the command line, I’d encourage everyone to learn something new about their development environment. Feel free to share your favorite Vim and command line tips in the comments!

Further reading

http://www.vim.org/docs.php

https://www.gnu.org/software/bash/manual/bash.html

http://perldoc.perl.org/perlrun.html

The post Command Line and Vim Tips from a Java Programmer appeared first on Fishbowl Solutions' C4 Blog.

Categories: Fusion Middleware, Other

Webinar Recording: Ryan Companies Leverages Fishbowl’s ControlCenter for Oracle WebCenter to Enhance Document Control Leading to Improved Knowledge Management

On Thursday, December 8th, Fishbowl had the privilege of presenting a webinar with Mike Ernst – VP of Construction Operations – at Ryan Companies regarding their use case for Fishbowl’s ControlCenter product for controlled document management. Mike was joined by Fishbowl’s ControlCenter product manager, Kim Negaard, who provided an overview of how the solution was implemented and how it is being used at Ryan.

Ryan Companies had been using Oracle WebCenter for many years, but they were looking for some additional document management functionality and a more intuitive interface to help improve knowledge management at the company. Their main initiative was to make it easier for users to access and manage their corporate knowledge documents (policies and procedures), manuals (safety), and real estate documents (leases) throughout each document’s life cycle.

Mike provided some interesting stats that factored into their decision to implement ControlCenter for WebCenter:

  • $16k – the average cost of “reinventing” procedures per project (ex. checklists and templates)
  • $25k – the average cost of estimating incorrect labor rates
  • 3x – salary to onboard someone new when an employee leaves the company

To hear more about how Ryan found knowledge management success with ControlCenter for WebCenter, watch the webinar recording: https://youtu.be/_NNFRV1LPaY

The post Webinar Recording: Ryan Companies Leverages Fishbowl’s ControlCenter for Oracle WebCenter to Enhance Document Control Leading to Improved Knowledge Management appeared first on Fishbowl Solutions' C4 Blog.

Categories: Fusion Middleware, Other

DBAs of the future

DBMS2 - Wed, 2016-11-23 06:02

After a July visit to DataStax, I wrote

The idea that NoSQL does away with DBAs (DataBase Administrators) is common. It also turns out to be wrong. DBAs basically do two things.

  • Handle the database design part of application development. In NoSQL environments, this part of the job is indeed largely refactored away. More precisely, it is integrated into the general app developer/architect role.
  • Manage production databases. This part of the DBA job is, if anything, a bigger deal in the NoSQL world than in more mature and automated relational environments. It’s likely to be called part of “devops” rather than “DBA”, but by whatever name it’s very much a thing.

That turns out to understate the core point, which is that DBAs still matter in non-RDBMS environments. Specifically, it’s too narrow in two ways.

  • First, it’s generally too narrow as to what DBAs do; people with DBA-like skills are also involved in other areas such as “data governance”, “information lifecycle management”, storage, or what I like to call data mustering.
  • Second — and more narrowly :) — the first bullet point of the quote is actually incorrect. In fact, the database design part of application development can be done by a specialized person up front in the NoSQL world, just as it commonly is for RDBMS apps.

My wake-up call for that latter bit was a recent MongoDB 3.4 briefing. MongoDB certainly has various efforts in administrative tools, which I won’t recapitulate here. But to my surprise, MongoDB also found a role for something resembling relational database design. The idea is simple: A database administrator defines a view against a MongoDB database, where views:

  • Are logical rather than materialized. (At least at this time.)
  • Have their permissions and so on set by the DBA.
  • Are the sole thing the programmer writes against.

Besides the obvious benefits in development ease and security, MongoDB says that performance can be better as well.* This is of course a new feature, without a lot of adoption at this time. Even so, it seems likely that NoSQL doesn’t obsolete any part of the traditional DBA role.

*I didn’t actually ask what a naive programmer can do to trash performance that views can forestall, but … well, I was once a naive programmer myself. :)
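
As a concrete illustration of that view-definition workflow, here is a minimal sketch using pymongo. The database, collection, and field names are hypothetical, and the permission setup is omitted:

# A minimal sketch, using pymongo, of the workflow described above: a DBA
# defines a logical (non-materialized) view exposing only the fields an
# application should see, and programmers query the view instead of the
# underlying collection. Database, collection, and field names are
# hypothetical; the permission setup is omitted.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["hr"]

# DBA step: create a read-only projection of the "employees" collection.
db.command(
    "create",
    "employee_directory",                  # the view's name
    viewOn="employees",                    # the underlying collection
    pipeline=[{"$project": {"name": 1, "department": 1, "_id": 0}}],
)

# Programmer step: query the view exactly like a collection; salary and
# any other hidden fields simply are not there.
for doc in db["employee_directory"].find({"department": "engineering"}):
    print(doc)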

Two trends that I think could make DBAs’ lives even more interesting and challenging in the future are:

  • The integration of quick data management into complex analytic processes. Here by “quick data management” I mean, for example, what you do in connection with a complex Hadoop or Spark (set of) job(s). Leaving the data management to a combination of magic and Python scripts doesn’t seem to respect how central data operations are to analytic tasks.
  • The integration of data management and streaming. I should probably write about this point separately, but in any case — it seems that streaming stacks will increasingly look like over-caffeinated DBMS.

Bottom line: Database administration skills will be needed for a long time to come.

Categories: Other

MongoDB 3.4 and “multimodel” query

DBMS2 - Wed, 2016-11-23 06:01

“Multimodel” database management is a hot new concept these days, notwithstanding that it’s been around since at least the 1990s. My clients at MongoDB of course had to join the train as well, but they’ve taken a clear and interesting stance:

  • A query layer with multiple ways to query and analyze data.
  • A separate data storage layer in which you have a choice of data storage engines …
  • … each of which has the same logical (JSON-based) data structure.

When I pointed out that it would make sense to call this “multimodel query” — because the storage isn’t “multimodel” at all — they quickly agreed.

To be clear: While there are multiple ways to read data in MongoDB, there’s still only one way to write it. Letting that sink in helps clear up confusion as to what about MongoDB is or isn’t “multimodel”. To spell that out a bit further:

  • In query, MongoDB mixes multiple paradigms for DML (Data Manipulation Language). The main one is of course JSON.
  • When writing, the DML paradigm is unmixed — it’s just JSON.

Further, MongoDB query DML statements can be mixed with analytic functions rooted in Spark.

The main ways to query data in MongoDB, to my knowledge, are:

  • Native/JSON. Duh.
  • SQL.
    • MongoDB has used MySQL as a guide to what SQL coverage they think the market is calling for.
    • More to the point, they’re trying to provide enough SQL so that standard business intelligence tools work well (enough) against MongoDB.
    • I neglected to ask why this changed from MongoDB’s adamantly non-SQL approach of 2 1/2 years ago.
  • Search.
    • MongoDB has been adding text search features for a few releases.
    • MongoDB’s newest search feature revolves around “facets”, in the Endeca sense of the term. MongoDB characterizes it as a kind of text-oriented GroupBy.
  • Graph. MongoDB just introduced a kind of recursive join capability, which is useful for detecting multi-hop relationships (e.g. ancestor/descendant rather than just parent/child). MongoDB declares that the “graph” box is thereby checked. :) (A short sketch of such a query follows this list.)
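
Here is a hedged sketch of that recursive-join capability, using the $graphLookup aggregation stage introduced in MongoDB 3.4, via pymongo. The collection and field names are made up for illustration:

# A hedged sketch of the recursive-join capability mentioned above, using
# MongoDB 3.4's $graphLookup aggregation stage via pymongo. The collection
# and field names are made up for illustration.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
people = client["org"]["people"]

# For each person, walk the reports_to chain to collect every ancestor
# (manager, manager's manager, ...), i.e. a multi-hop relationship rather
# than a single parent/child join.
pipeline = [
    {
        "$graphLookup": {
            "from": "people",              # collection to traverse
            "startWith": "$reports_to",    # initial value to follow
            "connectFromField": "reports_to",
            "connectToField": "name",
            "as": "management_chain",
        }
    }
]

for doc in people.aggregate(pipeline):
    print(doc["name"], [m["name"] for m in doc["management_chain"]])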

Three years ago, in an overview of layered and multi-DML architectures, I suggested:

  • Layered DBMS and multimodel functionality fit well together.
  • Both carried performance costs.
  • In most cases, the costs could be affordable.

MongoDB seems to have bought strongly into that view on the query side — which is, of course, exactly the right way for them to have started.

Categories: Other

Webinar: Quality, Safety, Knowledge Management with Oracle WebCenter Content and ControlCenter

DATE: THURSDAY, DECEMBER 8, 2016
TIME: 10:00 A.M. PST / 1:00 P.M. EST

Join Ryan Companies Vice President of Construction Operations, Mike Ernst, and Fishbowl Solutions Product Manager, Kim Negaard, to learn how Ryan Companies, a leading national construction firm, found knowledge management success with ControlCenter for Oracle WebCenter Content.

In this webinar, you’ll hear first-hand how ControlCenter has been implemented as part of Ryan’s Integrated Project Delivery Process helping them create a robust knowledge management system to promote consistent and effective operations across multiple regional offices. You’ll also learn how ControlCenter’s intuitive, modern user experience enabled Ryan to easily find documents across devices, implement reoccurring review cycles, and control both company-wide and project-specific documents throughout their lifecycle.

Register today.


The post Webinar: Quality, Safety, Knowledge Management with Oracle WebCenter Content and ControlCenter appeared first on Fishbowl Solutions' C4 Blog.

Categories: Fusion Middleware, Other

Approaches to Consider for Your Organization’s Windchill Consolidation Project

This post comes from Fishbowl Solutions’ Senior Solutions Architect, Seth Richter.

More and more organizations need to merge multiple Windchill instances into a single Windchill instance, whether because they acquired another company or because they had separate Windchill implementations based on old divisional borders. Whatever the situation, these organizations want to merge into a single Windchill instance to gain efficiencies and/or other benefits.

The first task for a company in this situation is to assemble the right team and develop the right plan. The team will need to understand the budget and begin to document key requirements and their implications. Will they hire an experienced partner like Fishbowl Solutions? If so, we recommend involving the partner early in the process so they can help navigate the key decisions, avoid pitfalls, and develop the best approach for success.

Once you start evaluating the technical process and tools to merge the Windchill instances, the most likely options are:

1. Manual Method

Moving data from one Windchill system to another manually is always an option. This method might be viable if there are small pockets of data to move in an ad-hoc manner. However, it is extremely time-consuming, so proceed with caution…if you get halfway through and then switch to one of the following methods, you might have hurt the process rather than helped it.

2. Third Party Tools (Fishbowl Solutions LinkExtract & LinkLoader tools)

This process can be a cost-effective alternative, but it is not as robust as the Windchill Bulk Migrator, so your requirements will dictate whether or not it is viable.

3. PTC Windchill Bulk Migrator (WBM) tool

This is a powerful, complex tool that works great if you have an experienced team running it. Fishbowl prefers the PTC Windchill Bulk Migrator in many situations because it can complete large merge projects over a weekend and migrates historical versions as part of the process.

A recent Fishbowl project involved a billion-dollar manufacturing company that had acquired another business and needed to consolidate CAD data from one Windchill system into its own. The project had an aggressive timeline because it needed to be completed before the company’s seasonal rush (and also be ready for an ERP integration). During the three-month project window, we kicked off the project, executed all of the test migrations and validations, scheduled a ‘go live’ date, and then completed the final production migration over a weekend. Users at the acquired company checked their data into their “old” Windchill system on a Friday and were able to check their data out of the main corporate instance on Monday, with zero engineering downtime.

Fishbowl Solutions’ PTC/PLM team has completed many Windchill merge projects such as this one. The unique advantage of working with Fishbowl is that we are PTC Software Partners and Windchill programming experts. Oftentimes, when other reseller/consulting partners get stuck waiting on PTC technical support, Fishbowl has been able to problem-solve and keep projects on time and on budget.

If your organization is seeking an effective and efficient way to bulk load data from one Windchill system to another, our experts at Fishbowl Solutions can accomplish this on time and on budget. Urgency is a priority in these circumstances, and we want to make the transition as hassle-free as possible, with no downtime. Not sure which tool is the best fit for your Windchill migration project? Check out our website, click the “Contact Us” tab, or reach out to Rick Passolt in our business development department for more information or to request a demo.

Contact Us

Rick Passolt
Senior Account Executive
952.456.3418
mcadsales@fishbowlsolutions.com

Seth Richter is a Senior Solutions Architect at Fishbowl Solutions. Fishbowl Solutions was founded in 1999. Their areas of expertise include Oracle WebCenter, PTC’s Product Development System (PDS), and enterprise search solutions using the Google Search Appliance. Check out our website to learn more about what we do.

The post Approaches to Consider for Your Organization’s Windchill Consolidation Project appeared first on Fishbowl Solutions' C4 Blog.

Categories: Fusion Middleware, Other

Consider Your Options for SolidWorks to Windchill Data Migrations

This post comes from Fishbowl Solutions’ Associate MCAD Consultant, Ben Sawyer.

CAD data migrations are most often seen as a huge burden. They can be lengthy, costly, messy, and a general roadblock to a successful project. Organizations planning on migrating SolidWorks data to PTC Windchill should consider their options when it comes to the process and tools they use to perform the bulk loading.

At Fishbowl Solutions, our belief is that the faster you can load all of your data accurately into Windchill, the faster your company can implement critical PLM business processes and realize the results of initiatives such as faster new product introduction (NPI), streamlined change and configuration management, and improved quality.

There are two typical scenarios we encounter with these data migration projects: the SolidWorks data resides on a network file system (NFS), or it resides in either PDMWorks or EPDM.

The right process and tools also depend on other factors, most commonly the quantity of data and the required project completion date. Here is how each scenario typically plays out.

Scenario One: Files on a Network File System

Manual Migration

There is always an option to manually migrate SolidWorks data into Windchill. However, if an organization has thousands of files across multiple products, this process can be extremely daunting. Loading manually involves bringing files into the Windchill workspace; carefully resolving missing dependents, errors, and duplicates; setting destination folders, revisions, and lifecycles; and fixing bad metadata. (Those who have tried this approach with large quantities of data know exactly the pain we are talking about!)

Automated Solution

Years ago, Fishbowl developed its LinkLoader tool for SolidWorks as a viable solution to complete a Windchill bulk loading project with speed and accuracy.

Fishbowl’s LinkLoader solution follows a simple workflow to help identify data to be cleansed and mass loaded with accurate metadata. The steps are as follows:

1. Discovery
In this initial stage, the user chooses the set of SolidWorks data to be loaded into Windchill. Since Windchill doesn’t allow duplicate-named CAD files in the system, the software quickly identifies any duplicate files. It is up to the user to resolve the duplicates or remove them from the data loading set.

2. Validation
The validation stage will ensure files are retrievable, attributes/parameters are extracted (for use in later stages), and relationships with other SolidWorks files are examined. LinkLoader captures all actions. The end user will need to resolve any errors or remove the data from the loading set.

3. Mapping
Moving toward the bulk loading stage, it is necessary to confirm and/or modify the attribute-mapping file as desired. The only required fields for mapping are lifecycle, revision/version, and the Windchill folder location. End users can leverage the attribute/parameter information from the validation stage, or create their own ‘Instance Based Attribute’ list to map with the files. (A generic sketch of this kind of mapping check follows the list below.)

4. Bulk Load
Once the mapping stage is completed, the loading process is ready. There is a progress indicator that displays the number of files completed and the percentage done. If there are errors with any files during the upload, it will document these in an ‘Error List Report’ and LinkLoader will simply move on to the next file.
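As a generic illustration of the kind of pre-load check described in the Mapping step above, the sketch below verifies that each row of a hypothetical mapping file supplies the three required values called out in step 3: lifecycle, revision/version, and Windchill folder location. The CSV layout, column names, and file name are assumptions for illustration only; they are not LinkLoader’s actual mapping-file format.

```python
# Hypothetical pre-load sanity check on an attribute-mapping CSV.
# Only the three required fields come from the description above; the
# column names and file layout are illustrative assumptions.
import csv

REQUIRED_COLUMNS = {"lifecycle", "revision", "windchill_folder"}

def check_mapping_file(path):
    """Return rows that are missing any of the required mapping values."""
    problems = []
    with open(path, newline="") as f:
        # Row 1 is the header, so data rows start at line 2.
        for line_no, row in enumerate(csv.DictReader(f), start=2):
            missing = [c for c in REQUIRED_COLUMNS if not (row.get(c) or "").strip()]
            if missing:
                problems.append((line_no, row.get("filename", "?"), missing))
    return problems

for line_no, filename, missing in check_mapping_file("solidworks_mapping.csv"):
    print(f"Row {line_no} ({filename}): missing {', '.join(missing)}")
```

Catching gaps like these before the Bulk Load step means fewer entries end up in the Error List Report later.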

Scenario Two: Files reside in PDMWorks or EPDM

Manual Migration

There is also an option to do a manual data migration from one system to another if files reside in PDMWorks or EPDM. However, this process can be just as tedious and drawn out as an NFS migration, or perhaps even more so.

Automated Solution

Having files within PDMWorks or EPDM can make the migration process more straightforward and faster than the NFS projects. Fishbowl has created an automated solution tool that extracts the latest versions of each file from the legacy system and immediately prepares it for loading into Windchill. The steps are as follows:

1. Extraction (LinkExtract)
In this initial stage, Fishbowl uses its LinkExtract tool to pull the latest version of all SolidWorks files, determine references, and extract all of the file attributes as defined in PDMWorks or EPDM.

2. Mapping
Before loading the files, it is necessary to confirm and/or modify the attribute-mapping file as desired. Admins can fully leverage the attribute/parameter information from the extraction step, or start from scratch if they find that easier. Often the destination Windchill system will have different terminology or lifecycle states, and it is easy to remap those as needed in this step.

3. Bulk Load
Once the mapping stage is completed, the loading process is ready. There is a progress indicator that displays the number of files completed and the percentage done. If there are errors with any files during the upload, it will document these in the Error List Report and LinkLoader will move on to the next file.

Proven Successes with LinkLoader

Many of Fishbowl’s customers have purchased and successfully run LinkLoader themselves with little to no assistance from Fishbowl. Other customers have utilized our consulting services to complete the migration project on their behalf.

With Fishbowl’s methodology centered on “Customer First,” our focus and support continuously keep our customers satisfied. This is the same commitment and expertise we will bring to any and every data migration project.

If your organization is looking to consolidate SolidWorks CAD data to Windchill in a timely and effective manner, regardless of the size and scale of the project, our experts at Fishbowl Solutions can get it done.

For example, Fishbowl partnered with a multi-billion dollar medical device company with a short time frame to migrate over 30,000 SolidWorks files from a legacy system into Windchill. Fishbowl’s expert team took initiative and planned the process to meet their tight industry regulations and finish on time and on budget. After the Fishbowl team executed test migrations, the actual production migration process only took a few hours, thus eliminating engineering downtime.

If your organization is seeking the right team and tools to complete a SolidWorks data migration to Windchill, reach out to us at Fishbowl Solutions.

If you’d like more information about Fishbowl’s LinkLoader tool or our other products and services for PTC Windchill and Creo, check out our website, click the “Contact Us” tab, or reach out to Rick Passolt in our business development department.

Contact Us

Rick Passolt
Senior Account Executive
952.465.3418
mcadsales@fishbowlsolutions.com

Ben Sawyer is an Associate MCAD Consultant at Fishbowl Solutions. Fishbowl Solutions was founded in 1999. Their areas of expertise include Oracle WebCenter, PTC’s Product Development System (PDS), and enterprise search solutions using the Google Search Appliance. Check out our website to learn more about what we do. 

The post Consider Your Options for SolidWorks to Windchill Data Migrations appeared first on Fishbowl Solutions' C4 Blog.

Categories: Fusion Middleware, Other

Rapid analytics

DBMS2 - Fri, 2016-10-21 09:17

“Real-time” technology excites people, and has for decades. Yet the actual, useful technology to meet “real-time” requirements remains immature, especially in cases which call for rapid human decision-making. Here are some notes on that conundrum.

1. I recently posted that “real-time” is getting real. But there are multiple technology challenges involved, including:

  • General streaming. Some of my posts on that subject are linked at the bottom of my August post on Flink.
  • Low-latency ingest of data into structures from which it can be immediately analyzed. That helps drive the (re)integration of operational data stores, analytic data stores, and other analytic support — e.g. via Spark. (A minimal sketch of that idea follows this list.)
  • Business intelligence that can be used quickly enough. This is a major ongoing challenge. My clients at Zoomdata may be thinking about this area more clearly than most, but even they are still in the early stages of providing what users need.
  • Advanced analytics that can be done quickly enough. Answers there may come through developments in anomaly management, but that area is still in its super-early days.
  • Alerting, which has been under-addressed for decades. Perhaps the anomaly management vendors will finally solve it.
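To make the “low-latency ingest” bullet above more concrete, here is a minimal Spark Structured Streaming sketch that lands a stream in an immediately queryable in-memory table. The socket source, port, and table name are assumptions for illustration; a real deployment would use Kafka or a similar source and a durable analytic store.

```python
# Minimal sketch: stream data into a structure that can be analyzed right away.
# Assumes a local Spark installation and something writing lines to port 9999.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("low-latency-ingest").getOrCreate()

# Ingest a stream of lines from a (hypothetical) socket source.
lines = (spark.readStream
         .format("socket")
         .option("host", "localhost")
         .option("port", 9999)
         .load())

# Land the stream in an in-memory table that is queryable while ingestion runs.
query = (lines.writeStream
         .format("memory")
         .queryName("events")       # hypothetical table name
         .outputMode("append")
         .start())

# Analysts (or a BI layer) can query the table as new data keeps arriving.
spark.sql("SELECT count(*) AS rows_so_far FROM events").show()
```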

2. In early 2011, I coined the phrase investigative analytics, about which I said three main things:

  • It is meant to contrast with “operational analytics”.
  • It is meant to conflate “several disciplines, namely”:
    • Statistics, data mining, machine learning, and/or predictive analytics.
    • The more research-oriented aspects of business intelligence tools.
    • Analogous technologies as applied to non-tabular data types such as text or graph.
  • A simple definition would be “Seeking (previously unknown) patterns in data.”

Generally, that has held up pretty well, although “exploratory” is the more widely used term. But the investigative/operational dichotomy obscures one key fact, which is the central point of this post: There’s a widespread need for very rapid data investigation.

3. This is not just a niche need. There are numerous rapid-investigation use cases to bear in mind, some already mentioned in my recent posts on anomaly management and real-time applications.

  • Network operations. This is my paradigmatic example.
    • Data is zooming all over the place, in many formats and structures, among many kinds of devices. That’s log data, header data and payload data alike. Many kinds of problems can arise …
    • … which operators want to diagnose and correct, in as few minutes as possible.
    • Interfaces commonly include real-time business intelligence, some drilldown, and a lot of command-line options.
    • I’ve written about various specifics, especially in connection with the vendors Splunk and Rocana.
  • Security and anti-fraud. Infosec and cyberfraud, to a considerable extent, are just common problems in network operations. Much of the response is necessarily automated — but the bad guys are always trying to outwit your automation. If you think they may have succeeded, you want to figure that out very, very fast.
  • Consumer promotion and engagement. Consumer marketers feel a great need for speed. Some of it is even genuine. :)
    • If an online promotion is going badly (or particularly well), they can in theory react almost instantly. So they’d like to know almost instantly, perhaps via BI tools with great drilldown.
    • The same is even truer in the case of social media eruptions and the like. Obviously, the tools here are heavily text-oriented.
    • Call centers and even physical stores have some of the same aspects as internet consumer operations.
  • Consumer internet backends, for e-commerce, publishing, gaming or whatever. These cases combine and in some cases integrate the previous three points. For example, if you get a really absurd-looking business result, that could be your first indication of network malfunctions or automated fraud.
  • Industrial technology, such as factory operations, power/gas/water networks, vehicle fleets or oil rigs. Much as in IT networks, these contain a diversity of equipment — each now spewing its own logs — and have multiple possible modes of failure. More often than is the case in IT networks, you can recognize danger signs, then head off failure altogether via preventive maintenance. But when you can’t, it is crucial to identify the causes of failure fast.
  • General IoT (Internet of Things) operation. This covers several of the examples above, as well as cases in which you sell a lot of devices, have them “phone home”, and labor to keep that whole multi-owner network working.
  • National security. If I told you what I meant by this one, I’d have to … [redacted].

4. And then there’s the investment industry, which obviously needs very rapid analysis. When I was a stock analyst, I could be awakened by a phone call and told news that I would need to explain to 1000s of conference call listeners 20 minutes later. This was >30 years ago. The business moves yet faster today.

The investment industry has invested heavily in high-speed supporting technology for decades. That’s how Mike Bloomberg got so rich founding a vertical market tech business. But investment-oriented technology remains a very vertical sector; little of it gets more broadly applied.

I think the reason may be that investing is about guesswork, while other use cases call for more definitive answers. In particular:

  • If you’re wrong 49.9% of the time in investing, you might still be a big winner.
  • In high-frequency trading, speed is paramount; you have to be faster than your competitors. In speed/accuracy trade-offs, speed wins.

5. Of course, it’s possible to overstate these requirements. As in all real-time discussions, one needs to think hard about:

  • How much speed is important in meeting users’ needs.
  • How much additional speed, if any, is important in satisfying users’ desires.

But overall, I have little doubt that rapid analytics is a legitimate area for technology advancement and growth.

Categories: Other

What I Have Learned as an Oracle WebCenter Consultant in My First Three Months at Fishbowl Solutions

This post comes from Fishbowl Solutions’ Associate Software Consultant, Jake Jehlicka.

Finishing college can be an intimidating experience for many. We leave what we know behind to open the gates to brand new experiences. Those of us fortunate enough to gain immediate employment often find ourselves leaving school and plunging headfirst into an entirely new culture a mere few weeks after turning in our last exam. It is exciting, yet frightening, and what can make or break the whole experience is the new environment in which you find yourself. I consider myself one of the lucky ones.

I have been with Fishbowl Solutions for just over three months, and the experience has been unlike any I encountered in my previous internships, work, or schooling in Duluth. I moved to the Twin Cities within a week of accepting the position. I was terrified, but my fears were very soon laid to rest. Fishbowl welcomed me with open arms, and I have learned an incredible amount in the short time that I have spent here. Here are just a few of the many aspects of Fishbowl, and the skills I’ve gained, since working here as an associate software consultant.

Culture

One of the things that jumped out at me right away is how critical a company’s culture is to making work enjoyable and sustainable. Right from the outset, I was invited and even encouraged to take part in Fishbowl’s company activities, like the summer softball team and the happy hours celebrating new employees joining the team. I have seen first-hand how much these activities bring the workplace together in a way that not only makes employees happy, but also makes them very approachable when it comes to questions or assistance. The culture here brings everyone together in a way that is unique to Fishbowl, and the work itself benefits because of it.

Teamwork

Over the past three months, I have also learned the importance of working together. I joined Fishbowl a few weeks after the other trainees in my group, so they were a bit ahead of me in the training program when I started. Not only were they ready and willing to answer any questions I had, but they also shared the knowledge they had acquired in such a way that I was able to catch up before our training was complete. Of course, the other trainees weren’t the only ones willing to lend their assistance. The team leads have always been there whenever I needed a technical question answered, or even when I just wanted advice about where my own career may be heading.

People Skills

The team leads also taught me that not every skill is something that can be measured. Throughout training, we were exposed to elements beyond the expected technical skills, including guidance on oft-neglected soft skills such as public speaking and client interactions. These skills are necessary regardless of which industry you are in, and it is thanks to them that I have already had positive experiences working with our clients.

Technical Skills

As a new software consultant at Fishbowl, I have gained a plethora of knowledge about various technologies and applications, especially Oracle technologies. The training that I received has prepared me for working with technologies like Oracle WebCenter in such a way that I was able to dive right into projects as soon as I finished. Working with actual systems was nearly a foreign concept after the small individual projects of college, but I learned enough from my team members to proceed with confidence. The training program at Fishbowl has a well-defined structure, with an agenda laying out what I should be working on in any given time period. A large portion of it was working directly with my own installation of the WebCenter Content server; I was responsible for setting up, configuring, and writing custom code for the servers in both Windows and Linux environments. The training program was very well documented, and I always had the tools, information, and assistance needed to complete every task.

Once the formal training ended, I was immediately assigned a customer project involving web development using Oracle’s Site Studio Designer. The training had actually covered this application and I was sufficiently prepared to tackle the new endeavor! With that said, every single day at Fishbowl is another day of education; no two projects are identical and there is always something to be learned. For example, I am currently learning Ext JS with Sencha Architect in preparation for a new project!

Although we may never know with absolute certainty what the future has in store for us, I can confidently say that the experiences, skills, and knowledge I have gained while working at Fishbowl Solutions will stay with me for the rest of my life.

Thank you to the entire Fishbowl team for everything they have done for me, and I look forward to growing alongside them!

Jake Jehlicka is an Associate Software Consultant at Fishbowl Solutions. Fishbowl Solutions was founded in 1999. Their areas of expertise include Oracle WebCenter, PTC’s Product Development System (PDS), and enterprise search solutions using the Google Search Appliance. Check out our website to learn more about what we do. 

The post What I Have Learned as an Oracle WebCenter Consultant in My First Three Months at Fishbowl Solutions appeared first on Fishbowl Solutions' C4 Blog.

Categories: Fusion Middleware, Other

Notes on anomaly management

DBMS2 - Mon, 2016-10-10 02:35

Then felt I like some watcher of the skies
When a new planet swims into his ken

— John Keats, “On First Looking Into Chapman’s Homer”

1. In June I wrote about why anomaly management is hard. Well, not only is it hard to do; it’s hard to talk about as well. One reason, I think, is that it’s hard to define what an anomaly is. And that’s a structural problem, not just a semantic one — if something is well enough understood to be easily described, then how much of an anomaly is it after all?

Artificial intelligence is famously hard to define for similar reasons.

“Anomaly management” and similar terms are not yet in the software marketing mainstream, and may never be. But naming aside, the actual subject matter is important.

2. Anomaly analysis is clearly at the heart of several sectors, including:

  • IT operations
  • Factory and other physical-plant operations
  • Security
  • Anti-fraud
  • Anti-terrorism

Each of those areas features one or both of the frameworks:

  • Surprises are likely to be bad.
  • Coincidences are likely to be suspicious.

So if you want to identify, understand, avert and/or remediate bad stuff, data anomalies are the first place to look.

3. The “insights” promised by many analytics vendors — especially those who sell to marketing departments — are also often heralded by anomalies. Already in the 1970s, Walmart observed that red clothing sold particularly well in Omaha, while orange flew off the shelves in Syracuse. And so, in large college towns, they stocked their stores to the gills with clothing in the colors of the local football team. They also noticed that fancy dresses for little girls sold especially well in Hispanic communities … specifically for girls at the age of First Communion.

4. The examples in the previous point may be characterized as noteworthy correlations that surely are reflecting actual causality. (The beer/diapers story would be another example, if only it were true.) Formally, the same is probably true of most actionable anomalies. So “anomalies” are fairly similar to — or at least overlap heavily with — “statistically surprising observations”.

And I do mean “statistically”. As per my Keats quote above, we have a classical model of sudden-shock discovery — an astronomer finding a new planet, a radar operator seeing a blip on a screen, etc. But Keats’ poem is 200 years old this month. In this century, there’s a lot more number-crunching involved.

Please note: It is certainly not the case that anomalies are necessarily found via statistical techniques. But however they’re actually found, they would at least in theory score as positives via various statistical tests.
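As a toy illustration of “scoring as positives via statistical tests,” the sketch below flags an observation as surprising when it sits several standard deviations from a historical baseline. The numbers and the three-sigma threshold are assumptions chosen for illustration; this is a generic example, not any particular vendor’s method.

```python
# Minimal sketch: a plain z-score test for a "statistically surprising" value.
# Baseline values and the latest observation are made up for illustration.
from statistics import mean, stdev

baseline = [102, 98, 101, 97, 103, 99, 100, 96, 104, 100]  # e.g. errors/hour in past windows
observed = 141                                              # the latest window

mu, sigma = mean(baseline), stdev(baseline)
z = (observed - mu) / sigma

# "More than three standard deviations away" is a common rule of thumb
# for flagging an observation as anomalous.
if abs(z) > 3:
    print(f"Surprising observation: z = {z:.1f}")
```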

5. There are quite a few steps to the anomaly-surfacing process, including but not limited to:

  • Collecting the raw data in a timely manner.
  • Identifying candidate signals (and differentiating them from noise).
  • Communicating surprising signals to the most eager consumers (and letting them do their own analysis).
  • Giving more tightly-curated information to a broader audience.

Hence many different kinds of vendor can have roles to play.

6. One vendor that has influenced my thinking about data anomalies is Nestlogic, an early-stage start-up with which I’m heavily involved. Here “heavily involved” includes:

  • I own more stock in Nestlogic than I have in any other company of which I wasn’t the principal founder.
  • I’m in close contact with founder/CEO David Gruzman.
  • I’ve personally written much of Nestlogic’s website content.

Nestlogic’s claims include:

  • For machine-generated data, anomalies are likely to be found in data segments, not individual records. (Here a “segment” might be all the data coming from a particular set of sources in a particular period of time.)
  • The more general your approach to anomaly detection, the better, for at least three reasons:
    • In adversarial use cases, the hacker/fraudster/terrorist/whatever might deliberately deviate from previous patterns, so as to evade detection by previously-established filters.
    • When there are multiple things to discover, one anomaly can mask another, until it is detected and adjusted for.
    • (This point isn’t specific to anomaly management.) More general tools can mean that an enterprise has fewer different new tools to adopt.
  • Anomalies boil down to surprising data profiles, so anomaly detection bears a slight resemblance to the data profiling approaches used in data quality, data integration and query optimization.
  • Different anomaly management users need very different kinds of UI. Less technical ones may want clear, simple alerts, with a minimum of false positives. Others may use anomaly management as a jumping-off point for investigative analytics and/or human real-time operational control.

I find these claims persuasive enough to help Nestlogic with its marketing and fund-raising, and to cite them in my post here. Still, please understand that they are Nestlogic’s and David’s assertions, not my own.

Categories: Other
